1. Introduction



This paper develops general Berry-Esseen bounds for the exponential distribution using Stein's
method. Two of our main results are given by the following statements. We let $I[A]$ denote the
indicator function of an event $A$.



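Theorem 1.1. Assume that $W$ and $W'$ are non-negative random variables on the same probability
space such that $\mathcal{L}(W') = \mathcal{L}(W)$. Then, if $Z \sim \mathrm{Exp}(1)$, we have for any $t > 0$ and any constant
$\lambda > 0$,

\[
\begin{aligned}
|\mathbb{P}[W \le t] - \mathbb{P}[Z \le t]| \le{}& \mathbb{E}\left|\left(\lambda^{-1}\mathbb{E}(D|W) + 1\right) I[W > 0]\right| + \mathbb{E}\left|\tfrac{1}{2\lambda}\mathbb{E}(D^2|W) - 1\right| \\
&+ \tfrac{1}{6\lambda}\mathbb{E}|D|^3 + \tfrac{1}{2\lambda}\mathbb{E}\left(D^2 I[|W - t| \le |D|]\right),
\end{aligned}
\]

where $D := W' - W$.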



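Theorem 1.2. Assume that $W$ and $W'$ are non-negative random variables on the same probability
space such that $\mathcal{L}(W') = \mathcal{L}(W)$ and

\[
\mathbb{E}(D|W) = -\lambda(W - 1),
\]

where $\lambda > 0$ is a fixed constant. Then if $Z \sim \mathrm{Exp}(1)$, we have for any $t > 0$,

\[
|\mathbb{P}[W \le t] - \mathbb{P}[Z \le t]| \le \frac{\mathbb{E}|2\lambda W - \mathbb{E}(D^2|W)|}{2\lambda t} + \frac{\mathbb{E}|D|^3 \max\{t^{-1}, 2t^{-2}\}}{6\lambda} + \frac{\mathbb{E}\{D^2 I[|W - t| \le |D|]\}}{\lambda t},
\]

where $D := W' - W$.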

The use of a pair $(W, W')$ is similar to the exchangeable pairs approach of Stein for normal
approximation [St1], but in the spirit of [Ro], throughout this paper we require only the weaker
assumption that $W$ and $W'$ have the same law. It can be challenging to obtain good bounds on
the error terms in Theorems 1.1 and 1.2, and we also develop a number of tools for doing that.

Before continuing, we mention that this is not the first paper to study exponential approximation
by Stein's method. Indeed, earlier works, in the more general context of chi-squared approximation,
include Mann [Mn], Luk [Lu], and Reinert [Re] (which also includes a discussion of unpublished
work of Pickett). The paper [Mn] uses exchangeable pairs, whereas [Lu] and [Re] use the generator
approach to Stein's method. However, all of these papers focus on approximating expectations of
smooth functions of $W$, rather than indicator functions of intervals, and so do not give Berry-Esseen
theorems. Moreover, the examples in [Lu] and [Re] are about sums of independent random
variables, whereas our example involves dependence.

Our main example is the spectrum of the Bernoulli-Laplace Markov chain. This Markov chain
was suggested as a model of diffusion and has the following description. Let $n$ be even. There are
two urns, the first containing $\frac{n}{2}$ white balls, and the second containing $\frac{n}{2}$ black balls. At each stage,
a ball is picked at random from each urn and the two are switched. Diaconis and Shahshahani
[DS] proved that $\frac{n}{8}\log(n) + \frac{cn}{2}$ steps suffice for this process to reach equilibrium, in the sense that
the total variation distance to the stationary distribution is at most $ae^{-dc}$ for positive universal
constants $a$ and $d$. In order to prove this, they used the fact that the spectrum of the Markov
chain consists of the numbers $1 - \frac{i(n-i+1)}{(n/2)^2}$ occurring with multiplicity $\binom{n}{i} - \binom{n}{i-1}$ for $1 \le i \le \frac{n}{2}$
and multiplicity 1 if $i = 0$. Hora proved the following result, which shows that the spectrum of the
Bernoulli-Laplace chain has an exponential limit.

Theorem 1.3. ([Ho1]) Consider the uniform measure on the set of the $\binom{n}{n/2}$ eigenvalues of the
Bernoulli-Laplace Markov chain. Let $\tau$ be a random eigenvalue chosen from this measure. Then
as $n \to \infty$, the random variable $W := \frac{n}{2}\tau + 1$ converges in distribution to an exponential random
variable with mean 1.

As an application of our general Berry-Esseen bound, the following result will be proved.

Theorem 1.4. Let $Z \sim \mathrm{Exp}(1)$, and let $W$ be as in Theorem 1.3. Then

\[
|\mathbb{P}\{W \le t\} - \mathbb{P}\{Z \le t\}| \le \frac{C}{\sqrt{n}}
\]

for all $t$, where $C$ is a universal constant. Moreover this rate is sharp in the sense that there is a
sequence of $n$'s tending to infinity, and corresponding $t_n$'s such that

\[
|\mathbb{P}(W_n \le t_n) - \mathbb{P}(Z \le t_n)| = \frac{2e^{-2}}{\sqrt{n}} + O(1/n).
\]

Note that the Bernoulli-Laplace Markov chain is equivalent to random walk on the Johnson 
graph $J(n, k)$ where $k = \frac{n}{2}$. The vertices consist of all size $k$ subsets of $\{1, \dots, n\}$, and two subsets
are connected by an edge if they differ in exactly one element. From a given vertex, random walk 
on the Johnson graph picks a neighbor uniformly at random, and moves there. 
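
The spectral description above is easy to explore numerically. The following Python sketch is an illustration only and is not part of the paper's argument; the function names `bl_law` and `kolmogorov_distance` are hypothetical. It builds the law of $W = \frac{n}{2}\tau + 1$ from the eigenvalues and multiplicities stated above and computes $\sup_t |\mathbb{P}(W \le t) - \mathbb{P}(Z \le t)|$ for $Z \sim \mathrm{Exp}(1)$. In light of Theorem 1.4, one expects $\sqrt{n}$ times this distance to stay bounded as $n$ grows.

```python
from fractions import Fraction
from math import comb, exp, sqrt


def bl_law(n):
    """Law of W = (n/2)*tau + 1 as a sorted list of (value, probability).

    Assumes n is even. tau ranges over the eigenvalues
    1 - i(n-i+1)/(n/2)^2 with multiplicity C(n,i) - C(n,i-1) (1 if i = 0).
    """
    half = n // 2
    total = comb(n, half)  # number of eigenvalues, counted with multiplicity
    law = []
    for i in range(half + 1):
        mult = comb(n, i) - (comb(n, i - 1) if i >= 1 else 0)
        tau = 1 - Fraction(i * (n - i + 1), half * half)  # exact eigenvalue
        w = Fraction(n, 2) * tau + 1
        law.append((float(w), float(Fraction(mult, total))))
    return sorted(law)


def kolmogorov_distance(n):
    """sup over t of |P(W <= t) - P(Z <= t)| with Z ~ Exp(1).

    W is discrete and the Exp(1) distribution function is continuous and
    increasing, so the supremum is attained at an atom of W, either just
    before the jump or at the jump.
    """
    cdf, worst = 0.0, 0.0
    for w, p in bl_law(n):
        target = 1.0 - exp(-w)
        worst = max(worst, abs(cdf - target))  # left limit at the atom
        cdf += p
        worst = max(worst, abs(cdf - target))  # value at the atom
    return worst


if __name__ == "__main__":
    for n in (20, 100, 400):
        print(n, sqrt(n) * kolmogorov_distance(n))
```

The exact rational arithmetic in `bl_law` is a design choice to avoid rounding the atom at $W = 0$ to a tiny negative number.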

One reason why our method for proving Theorem 1.4 is of interest (despite the existence of a
more elementary argument for a weaker version of Theorem 1.4 sketched at the end of Section 4) is
that Theorem 1.4 is in fact a small piece of a much larger program. To explain, limit theorems for
graph spectra (especially Cayley graphs and finite symmetric spaces) have been studied by many
authors and from various perspectives; some references are [Ho1], [Ho2], [Ke], [F1], [F2], [F3], [F4],
[Sn], [ShSu], [T1], and [T2]. In particular, the references [T1] and [T2] describe some challenging
conjectures where the limit distribution is the semicircle law and relate them to deep work in
number theory. With the long-term goal of making progress on these conjectures, the paper [F4]
gave some general constructions for applying Stein's method to study graph spectra, and worked
out examples in the case of normal approximation. The current paper works out an exponential
example, and is excellent evidence that these constructions will prove useful in other settings where
the spectrum has a non-normal limit. We also emphasize that while there are papers such as [GT]
which obtain non-normal limit theorems with an error term in spectral problems, they study the
spectrum of random objects, whereas our work, and the conjectures of [T1], [T2], all pertain to the
spectrum of a sequence of fixed, non-random graphs.

We also mention that an additional reason for studying the spectrum of the Bernoulli-Laplace 
chain is that it is closely related to the spectrum of the random transposition walk (and so with 
representation theory of the symmetric group). Indeed, from jS^, the eigenvalues of the Bernoulli- 
Laplace chain can be expressed as ^^""^^ /i — ^-"""^^ as /x ranges over a subset of eigenvalues of the 
random transposition walk. This relationship is not surprising given that the Bernoulli-Laplace 
chain transposes balls from different urns at each step. But together with the large body of work 
on Kerov's central limit theorem for the spectrum of the random transposition walk ([Ke], [F2],
[F3], [F4], [Sn], [IO], [Ho2]), it does make the problems studied in the current paper very natural.
As a final justification for the current paper, we believe that the example in it will serve as a useful 
testing ground for other researchers in Stein's method (certainly it helped us in developing our 
Berry-Esseen theorems). 

The organization of this paper is as follows. Section 2 proves our first general Berry-Esseen
bound for the exponential law, namely Theorem 1.1 above, and develops tools for analyzing the
error terms which appear in it. Section 3 proves our second general Berry-Esseen bound for the
exponential law, namely Theorem 1.2 above, and develops tools paralleling those in Section 2 for
analyzing the error terms. Section 4 treats our main example (spectrum of the Bernoulli-Laplace
chain), proving Theorem 1.4. An interesting feature of the proof is that it uses theory from both
of Sections 2 and 3 to treat the cases of small and large $t$ respectively. Finally, the appendix gives
an algebraic approach to the exchangeable pair and moment computations in Section 4, linking it
with the constructions of [F4]. This is not essential to the proofs of any of the results in the main
body of the paper, but does motivate the exchangeable pairs used in the paper, which could be
difficult to guess.

2. Berry-Esseen Bound for the Exponential Law: Version 1 

A main purpose of this section is to prove Theorem 1.1 from the introduction, and to develop
tools for analyzing the error terms which arise in it. To begin we make some remarks concerning
the statement of Theorem 1.1.

Remarks:

(1) In our main example (see Section 4), the relation $\mathbb{E}(D|W) = -\lambda$ is satisfied for all $W > 0$.
Hence the first error term in Theorem 1.1 will vanish. In the spirit of [RR], one could also
have that $\mathbb{E}(D|W) = -\lambda + R$ for some non-trivial random variable $R$.

(2) Although $W$ is allowed to attain the value 0 (and does, in our main example), the conditional
expectation $\mathbb{E}(D|W = 0)$ (i.e. the "drift" at zero) does not enter in the first term of the
bound.

(3) In our main example (see Section 4), $\mathbb{E}(D^2|W) = 2\lambda$ and so the second error term also
vanishes. The third error term $\mathbb{E}|D|^3$ can be bounded using the Cauchy-Schwarz inequality
$\mathbb{E}|D|^3 \le \sqrt{\mathbb{E}|D|^2\,\mathbb{E}|D|^4}$. The error term which is difficult to bound in practice is the fourth
error term, and later in this section we develop suitable tools (see Theorems 2.2 and 2.3).

Before embarking on the proof of Theorem 1.1 we recall the main idea of Stein's method in
our context. As observed by Stein [St2], a random variable $Z$ on $[0,\infty)$ is $\mathrm{Exp}(1)$ if and only if
$\mathbb{E}[f'(Z) - f(Z)] = -f(0^+)$ for all functions $f$ in a large class of functions (whose precise definition
we do not need). Here $f(0^+)$ is the limiting value of $f(a)$ as $a$ approaches 0 from the right. Stein's
characterization of the exponential distribution motivates the study of the function $f(x)$ solving
the equation

\[
f'(x) - f(x) = I[x \le t] - (1 - e^{-t}), \qquad x \ge 0.
\]

Indeed, for this $f$ one has that

\[
\mathbb{P}(W \le t) - \mathbb{P}(Z \le t) = \mathbb{E}[f'(W) - f(W)],
\]

and the problem becomes that of bounding $\mathbb{E}[f'(W) - f(W)]$.
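
For readers who want the integration by parts behind this characterization spelled out, here is a brief sketch of the "only if" direction. It assumes (our assumption, since the text deliberately leaves the function class unspecified) that $f$ is absolutely continuous on $(0,\infty)$ with $f(x)e^{-x} \to 0$ as $x \to \infty$.

```latex
\[
\mathbb{E} f'(Z) = \int_0^\infty f'(x) e^{-x}\,dx
  = \Big[ f(x) e^{-x} \Big]_{0^+}^{\infty} + \int_0^\infty f(x) e^{-x}\,dx
  = -f(0^+) + \mathbb{E} f(Z),
\]
\[
\text{so that}\qquad \mathbb{E}[f'(Z) - f(Z)] = -f(0^+).
\]
```

The displayed identity for $W$ then follows by evaluating the differential equation at $x = W$ and taking expectations, since $\mathbb{P}(Z \le t) = 1 - e^{-t}$.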
We begin with the following lemma. 

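Lemma 2.1. For every $t > 0$, the function

\[
f(x) := e^{-(t-x)^+} - e^{-t}, \qquad x \ge 0, \tag{1}
\]

(where in (2) we define the derivative $f'(t) := f'(t^-)$), satisfies the differential equation

\[
f'(x) - f(x) = I[x \le t] - (1 - e^{-t}), \qquad x \ge 0, \tag{2}
\]

and the bounds

\[
\|f\|_\infty \le 1, \qquad \|f'\|_\infty \le 1, \qquad \sup_{x,y \ge 0} |f'(x) - f'(y)| \le 1. \tag{3}
\]

The second derivative $f''$, defined for every $x \ne t$, satisfies

\[
\sup_{x \ne t} |f''(x)| \le 1. \tag{4}
\]

Proof. Write

\[
f(x) = \begin{cases} e^{x-t} - e^{-t} & \text{if } x \le t, \\ 1 - e^{-t} & \text{if } x > t. \end{cases}
\]

Together with the definition of $f'(t)$ this yields

\[
f'(x) = \begin{cases} e^{-t+x} & \text{if } x \le t, \\ 0 & \text{if } x > t. \end{cases}
\]

Thus, on $x \le t$,

\[
f'(x) - f(x) = 1 - (1 - e^{-t}),
\]

which is (2), and on $x > t$

\[
f'(x) - f(x) = -(1 - e^{-t}),
\]

which again is (2). The bounds (3) and (4) are straightforward; to obtain the last bound in (3)
note that $f'$ is non-negative. □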
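
As a quick sanity check on Lemma 2.1, the following Python sketch evaluates the residual of the differential equation (2) on a grid and verifies the sup-norm bounds in (3). It is an illustration only; the helper names (`f`, `f_prime`, `stein_residual`, `check`) and the grid parameters are our own choices.

```python
from math import exp


def f(x, t):
    """Solution (1): f(x) = exp(-(t - x)^+) - exp(-t)."""
    return exp(-max(t - x, 0.0)) - exp(-t)


def f_prime(x, t):
    """Derivative of f, using the left derivative at x = t as in the lemma."""
    return exp(x - t) if x <= t else 0.0


def stein_residual(x, t):
    """f'(x) - f(x) minus the right-hand side of (2); should be zero."""
    rhs = (1.0 if x <= t else 0.0) - (1.0 - exp(-t))
    return f_prime(x, t) - f(x, t) - rhs


def check(t, grid_size=2000, x_max=10.0):
    """Return (largest |residual| on the grid, whether 0 <= f, f' <= 1 holds)."""
    xs = [x_max * k / grid_size for k in range(grid_size + 1)]
    worst = max(abs(stein_residual(x, t)) for x in xs)
    bounded = all(0.0 <= f(x, t) <= 1.0 and 0.0 <= f_prime(x, t) <= 1.0 for x in xs)
    return worst, bounded
```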



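Now we give a proof of Theorem 1.1.

Proof of Theorem 1.1. Using (2) it is clear that we only need to bound $\mathbb{E}(f'(W) - f(W))$.
Fix $t > 0$ and let $F(x) := \int_0^x f(y)\,dy$. By Taylor expansion,

\[
\begin{aligned}
0 &= \mathbb{E}\left(F(W') - F(W)\right) \\
  &= \mathbb{E}\left(D f(W)\right) + \mathbb{E}\left(D^2 \int_0^1 (1-s) f'(W + sD)\,ds\right) \\
  &= \mathbb{E}\left(D f(W) + \tfrac{1}{2} D^2 f'(W)\right) + \mathbb{E}\left(D^2 J\right),
\end{aligned} \tag{5}
\]

where

\[
J := \int_0^1 (1-s)\left(f'(W + sD) - f'(W)\right) ds.
\]

Let $A$ be the event that $W \wedge W' \le t < W \vee W'$. On $A^c$ we thus have for every $0 \le s \le 1$

\[
|f'(W + sD) - f'(W)| \le s|D| \tag{6}
\]

by (4), whereas on $A$ we have

\[
|f'(W + sD) - f'(W)| \le 1 \tag{7}
\]

by (3). Dividing (5) by $\lambda$ and noting that $f(0) = 0$, and thus $f(W) = I[W > 0] f(W)$, we can use
this to obtain that

\[
\begin{aligned}
\mathbb{E}(f'(W) - f(W)) &= \mathbb{E}\left(f'(W) - f(W) - \lambda^{-1}(F(W') - F(W))\right) \\
 &= -\mathbb{E}\left(\left(\lambda^{-1}\mathbb{E}(D|W) + 1\right) f(W) I[W > 0]\right) \\
 &\quad + \mathbb{E}\left(\left(1 - \tfrac{1}{2\lambda}\mathbb{E}(D^2|W)\right) f'(W)\right) \\
 &\quad - \tfrac{1}{\lambda}\mathbb{E}\left(I[A^c] D^2 J\right) - \tfrac{1}{\lambda}\mathbb{E}\left(I[A] D^2 J\right).
\end{aligned} \tag{8}
\]

Invoking the bounds (6) and (7), we have

\[
I[A^c] D^2 |J| \le \tfrac{1}{6}|D|^3, \qquad I[A] D^2 |J| \le \tfrac{1}{2} D^2 I[|W - t| \le |D|],
\]

where the second inequality uses the fact that $A$ implies $|W - t| \le |D|$. Combining these bounds
with (8) and the bounds $\|f\|_\infty, \|f'\|_\infty \le 1$ from Lemma 2.1 completes the proof. □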
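
The constants $\frac{1}{6}$ and $\frac{1}{2}$ in the last step come from two elementary integrals. The following worked step is our own spelling-out of how (6) and (7) translate into bounds on $J$ as defined in the proof:

```latex
\[
\text{on } A^c:\quad |J| \le \int_0^1 (1-s)\, s\,|D|\,ds
   = |D|\left(\tfrac{1}{2} - \tfrac{1}{3}\right) = \tfrac{1}{6}|D|,
\qquad
\text{on } A:\quad |J| \le \int_0^1 (1-s)\,ds = \tfrac{1}{2}.
\]
```

Multiplying by $D^2/\lambda$ and taking expectations gives exactly the third and fourth error terms of Theorem 1.1.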

The quantity that is difficult to bound in practice when applying Theorem 1.1 is

\[
\mathbb{E}\left(D^2 I[|W - t| \le |D|]\right).
\]

One tool which is useful for bounding this quantity is the following theorem.

Theorem 2.2. Assume that $W$ and $W'$ are real valued random variables on the same probability
space such that $\mathcal{L}(W') = \mathcal{L}(W)$. Let $D = W' - W$. Then for any $t \in \mathbb{R}$ and $c > 0$,

\[
\mathbb{E}\left(D^2 I\{|W - t| \le |D|\}\right) \le 4c\,\mathbb{E}|\mathbb{E}(D|W)| + \mathbb{E}\left(D^2 I\{|D| > c\}\right).
\]

However, Theorem 2.2 does not always give good bounds. The next result, though more demanding,
can lead to sharper bounds.

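Theorem 2.3. Assume that $W$ and $W'$ are non-negative random variables on the same probability
space such that $\mathcal{L}(W') = \mathcal{L}(W)$; let $D = W' - W$. Then for any positive constants $t$, $k_1$, $k_2$, $K_1$,
$K_2$ and $K_3$ (where $k_2 < k_1$ and $K_2 < K_3$) we have

\[
\begin{aligned}
\mathbb{E}\left\{D^2 I[|W - t| \le |D|]\right\} \le{}& k_2 + k_1 e_2 + e_1 \\
&+ \frac{\mathbb{E}|\mathbb{E}(D|W)|}{K_3 - K_2}\Big( k_2 \ln(k_1/k_2) + \sqrt{32\, t k_1 k_2} + 2K_1 t^{1/2} k_1 + 4K_1\sqrt{k_1 k_2} + 4K_1 (t k_2 k_1^3)^{1/4} \Big),
\end{aligned}
\]

where

\[
\begin{aligned}
e_1 &:= \mathbb{E}\left\{\mathbb{E}(D^2|W) \cdot I\left[\mathbb{E}(D^2|W) > k_1 \text{ or } \mathbb{E}(D^4|W) > k_2(W + t)\right]\right\}, \\
e_2 &:= \mathbb{P}\left[\mathbb{E}(D^2|W) < K_3 \text{ or } \mathbb{E}(D^4|W) > K_1^2 K_2 W\right].
\end{aligned}
\]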



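The following lemma will be used in the proofs of both Theorems 2.2 and 2.3.

Lemma 2.4. Suppose that $W$ and $W'$ are random variables on the same probability space such
that $\mathcal{L}(W') = \mathcal{L}(W)$; set $D = W' - W$. Then, for any $a < b \in \mathbb{R}$ and $K > 0$,

\[
\mathbb{E}\left\{D^2 I[a \le W \le b, |D| \le K]\right\} \le (b - a + 2K)\,\mathbb{E}|\mathbb{E}(D|W)|.
\]

Proof. Define

\[
h(x) = \begin{cases} -\tfrac{1}{2}(b-a) - K & \text{if } x < a - K, \\ x - \tfrac{1}{2}(a+b) & \text{if } a - K \le x \le b + K, \\ \tfrac{1}{2}(b-a) + K & \text{if } x > b + K, \end{cases}
\]

and $H(x) := \int_0^x h(t)\,dt$. Observe that for any $0 \le s \le 1$,

\[
I[a \le W \le b, |D| \le K] \le I[a - K \le W + sD \le b + K] = h'(W + sD),
\]

and that

\[
\|h\|_\infty = \tfrac{1}{2}(b - a) + K. \tag{9}
\]

Using Taylor expansion we have

\[
0 = \mathbb{E}H(W') - \mathbb{E}H(W) = \mathbb{E}(D h(W)) + \mathbb{E}\left(D^2 \int_0^1 (1-s) h'(W + sD)\,ds\right), \tag{10}
\]

and thus

\[
\begin{aligned}
\mathbb{E}\left\{D^2 I[a \le W \le b, |D| \le K]\right\}
 &= 2\,\mathbb{E}\left(D^2 \int_0^1 (1-s) I[a \le W \le b, |D| \le K]\,ds\right) \\
 &\le 2\,\mathbb{E}\left(D^2 \int_0^1 (1-s) h'(W + sD)\,ds\right) \\
 &= -2\,\mathbb{E}(D h(W)) \qquad [\text{by (10)}] \\
 &\le 2\,|\mathbb{E}(\mathbb{E}(D|W) h(W))| \\
 &\le 2\,\|h\|_\infty\,\mathbb{E}|\mathbb{E}(D|W)|,
\end{aligned}
\]

which together with (9) proves the claim. □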

As the following argument shows, Theorem 2.2 is a straightforward consequence of Lemma 2.4.

Proof of Theorem 2.2. Clearly

\[
\mathbb{E}\left(D^2 I\{|W - t| \le |D|\}\right) = \mathbb{E}\left(D^2 I\{|W - t| \le |D|, |D| > c\}\right) + \mathbb{E}\left(D^2 I\{|W - t| \le |D|, |D| \le c\}\right).
\]

The first term is at most $\mathbb{E}(D^2 I\{|D| > c\})$. To upper bound the second term, note that if
$|W - t| \le |D| \le c$, then $a \le W \le b$ where $a = t - c$ and $b = t + c$. Hence Lemma 2.4 (with $K = c$) gives that

\[
\mathbb{E}\left(D^2 I\{|W - t| \le |D|, |D| \le c\}\right) \le 4c\,\mathbb{E}|\mathbb{E}(D|W)|.
\]

□



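We close this section by proving Theorem 2.3.

Proof of Theorem 2.3. Define

\[
B(W) := I\left[\mathbb{E}[(D^2 I[|D| > K_1 W^{1/2}])|W] \le K_2,\ \mathbb{E}(D^2|W) \ge K_3\right].
\]

Now note that

\[
\mathbb{E}[(D^2 I[D^2 > K_1^2 W])|W] \le \frac{\mathbb{E}(D^4|W)}{K_1^2 W}.
\]

From this it is easy to see that

\[
\mathbb{E}(1 - B(W)) \le \mathbb{P}\left[\mathbb{E}(D^2|W) < K_3 \text{ or } \mathbb{E}(D^4|W) > K_1^2 K_2 W\right] = e_2.
\]

Note that if $B(W) = 1$ then

\[
\mathbb{E}[(D^2 I[|D| \le K_1 W^{1/2}])|W] = \mathbb{E}(D^2|W) - \mathbb{E}[(D^2 I[|D| > K_1 W^{1/2}])|W] \ge K_3 - K_2.
\]

Thus,

\[
\begin{aligned}
\mathbb{P}[a \le W \le b] &\le \mathbb{E}\{I[a \le W \le b] B(W)\} + e_2 \\
 &\le \mathbb{E}\left\{\frac{\mathbb{E}[(D^2 I[|D| \le K_1 W^{1/2}])|W]}{K_3 - K_2}\, I[a \le W \le b] B(W)\right\} + e_2 \\
 &\le \frac{1}{K_3 - K_2}\,\mathbb{E}\left\{D^2 I[a \le W \le b, |D| \le K_1 b^{1/2}]\right\} + e_2 \\
 &\le \frac{b - a + 2K_1 b^{1/2}}{K_3 - K_2}\,\mathbb{E}|\mathbb{E}(D|W)| + e_2,
\end{aligned} \tag{11}
\]

where the last inequality is due to Lemma 2.4.
Now, define

\[
A(W) := I\left[\mathbb{E}(D^2|W) \le k_1,\ \mathbb{E}(D^4|W) \le k_2(W + t)\right].
\]

Then,

\[
\mathbb{E}\{D^2(1 - A(W))\} = \mathbb{E}\left\{\mathbb{E}(D^2|W) \cdot I\left[\mathbb{E}(D^2|W) > k_1 \text{ or } \mathbb{E}(D^4|W) > k_2(W + t)\right]\right\} = e_1.
\]

It follows that

\[
\begin{aligned}
\mathbb{E}\{D^2 I[|W - t| \le |D|]\}
 &\le \mathbb{E}\{D^2 I[(W - t)^2 \le D^2] A(W)\} + e_1 \\
 &\le \mathbb{E}\{\min\{D^2, D^4 (W - t)^{-2}\} A(W)\} + e_1 \\
 &\le \mathbb{E}\{\min\{\mathbb{E}(D^2|W), \mathbb{E}(D^4|W)(W - t)^{-2}\} A(W)\} + e_1 \\
 &\le \mathbb{E}\{\min\{k_1, k_2(W + t)(W - t)^{-2}\}\} + e_1.
\end{aligned}
\]

Now,

\[
\begin{aligned}
\mathbb{E}\{\min\{k_1, k_2(W + t)(W - t)^{-2}\}\}
 &= \int_0^\infty \mathbb{P}\left[k_1 > x,\ k_2(W + t)(W - t)^{-2} > x\right] dx \\
 &\le k_2 + \int_{k_2}^{k_1} \mathbb{P}\left[k_2(W + t)(W - t)^{-2} > x\right] dx.
\end{aligned}
\]

Suppose that $k_2(W + t)(W - t)^{-2} > x$. Then, solving the equation $k_2(w + t)(w - t)^{-2} = x$, one has
that

\[
\begin{aligned}
W &\in \left[t + \frac{k_2}{2x} - \sqrt{\frac{2tk_2}{x} + \frac{k_2^2}{4x^2}},\ t + \frac{k_2}{2x} + \sqrt{\frac{2tk_2}{x} + \frac{k_2^2}{4x^2}}\right] \\
  &\subset \left[t - \sqrt{\frac{2tk_2}{x}},\ t + \frac{k_2}{x} + \sqrt{\frac{2tk_2}{x}}\right].
\end{aligned}
\]

Thus, combining this with the concentration inequality (11),

\[
\begin{aligned}
\mathbb{E}\{D^2 I[|W - t| \le |D|]\}
 &\le k_2 + e_1 + \int_{k_2}^{k_1} \mathbb{P}\left[k_2(W + t)(W - t)^{-2} > x\right] dx \\
 &\le k_2 + e_1 + \int_{k_2}^{k_1} \mathbb{P}\left[t - \sqrt{\frac{2tk_2}{x}} \le W \le t + \frac{k_2}{x} + \sqrt{\frac{2tk_2}{x}}\right] dx \\
 &\le k_2 + k_1 e_2 + e_1 + \frac{\mathbb{E}|\mathbb{E}(D|W)|}{K_3 - K_2} \int_{k_2}^{k_1} \left(\frac{k_2}{x} + \sqrt{\frac{8tk_2}{x}} + 2K_1\left(t + \frac{k_2}{x} + \sqrt{\frac{2tk_2}{x}}\right)^{1/2}\right) dx \\
 &\le k_2 + k_1 e_2 + e_1 + \frac{\mathbb{E}|\mathbb{E}(D|W)|}{K_3 - K_2} \int_{k_2}^{k_1} \left(\frac{k_2}{x} + \sqrt{\frac{8tk_2}{x}} + 2K_1 t^{1/2} + 2K_1\sqrt{\frac{k_2}{x}} + 2K_1\left(\frac{2tk_2}{x}\right)^{1/4}\right) dx \\
 &\le k_2 + k_1 e_2 + e_1 + \frac{\mathbb{E}|\mathbb{E}(D|W)|}{K_3 - K_2} \left(k_2(\ln k_1 - \ln k_2) + \sqrt{32\,t k_1 k_2} + 2K_1 t^{1/2} k_1 + 4K_1\sqrt{k_1 k_2} + \tfrac{8}{3} K_1 (2tk_2)^{1/4} k_1^{3/4}\right).
\end{aligned}
\]

Since $\tfrac{8}{3}\,2^{1/4} \le 4$, this proves the claim. □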

3. Berry-Esseen Bound for the Exponential Law: Version 2 

A main goal of this section is to prove Theorem 1.2 from the introduction, and to develop tools
for analyzing the error terms which appear in it. In particular, the third term can be hard to
bound. One way to bound it is to apply Theorem 2.2 from Section 2. Another way is to use the
following more demanding result, which is analogous to Theorem 2.3 from Section 2.



Theorem 3.1. Let $W$ and $W'$ be non-negative random variables on the same probability space such
that $\mathcal{L}(W') = \mathcal{L}(W)$. Suppose that $\mathbb{E}(D|W) = -\lambda(W - 1)$, where $D = W' - W$ and $\lambda > 0$ is a fixed
constant. Then for any $t > 0$ and $\kappa > 0$

\[
\mathbb{E}\left(D^2 I\{|W - t| \le |D|\}\right) \le 16\lambda^2\kappa^2 + 1040\,\lambda^{3/2}\kappa\,\mathbb{E}|W - 1|\sqrt{t} + 8\lambda\, e_2(\tfrac{1}{3}t)\, t + e_1(t),
\]

where $e_1$ and $e_2$ are functions defined on $(0, \infty)$ as

\[
e_1(t) := \mathbb{E}\left[\mathbb{E}(D^2|W)\, I\{\mathbb{E}(D^2|W) > 2\lambda(W + t) \text{ or } \mathbb{E}(D^4|W) > 4\lambda^2(\kappa^2 W^2 + \kappa^2 t^2)\}\right]
\]

and

\[
e_2(t) := \mathbb{P}\left\{\mathbb{E}(D^2|W) < 2\lambda(W - \tfrac{1}{4}t) \text{ or } \mathbb{E}(D^4|W) > 4\lambda^2(\kappa^2 W^2 + \kappa^2 t^2)\right\}.
\]

Moreover, the above bound holds if the assumption of positivity of $W$ is replaced by the assumption
that $W$ is non-negative and assumes only finitely many values.

Remarks:

(1) The idea behind the formulation of Theorem 3.1 is the following: in many problems, we have
$\mathbb{E}(D^4|W) \le 4\lambda^2(\kappa^2 W^2 + \eta)$ where $\kappa$ is some constant and $\eta$ is a negligible term (possibly
random).

(2) The random variable $W$ in the example of this paper can assume the value 0 with positive
probability.

It is easy to check by integration by parts that if a random variable $Z$ on $[0, \infty)$ is $\mathrm{Exp}(1)$, then
$\mathbb{E}[Zf'(Z) - (Z - 1)f(Z)] = 0$ for well behaved functions $f$. This motivates the study of the solution
$f(x)$ to the equation

\[
xf'(x) - (x - 1)f(x) = I\{x \le t\} - (1 - e^{-t}), \qquad x \ge 0.
\]

Indeed, for such $f$ one has that

\[
\mathbb{P}(W \le t) - \mathbb{P}(Z \le t) = \mathbb{E}[Wf'(W) - (W - 1)f(W)],
\]

and the problem becomes that of bounding

\[
\mathbb{E}[Wf'(W) - (W - 1)f(W)].
\]
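
The integration by parts mentioned at the start of this passage can be written out as follows. This is a sketch under the assumption (ours) that $f$ is absolutely continuous with $x f(x) e^{-x} \to 0$ both as $x \to 0^+$ and as $x \to \infty$.

```latex
\[
\mathbb{E}[Z f'(Z)] = \int_0^\infty x f'(x) e^{-x}\,dx
  = \Big[ x f(x) e^{-x} \Big]_{0^+}^{\infty} - \int_0^\infty f(x)\,(1 - x)\,e^{-x}\,dx
  = \mathbb{E}[(Z - 1) f(Z)],
\]
\[
\text{hence}\qquad \mathbb{E}[Z f'(Z) - (Z - 1) f(Z)] = 0.
\]
```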

Remark: Earlier authors (Mann [Mn], Luk [Lu], Pickett and Reinert [Re]) studied solutions of
the equation

\[
xf''(x) - (x - 1)f'(x) = h(x) - \int_0^\infty e^{-x} h(x)\,dx,
\]

for functions $h$ whose first $k$ derivatives are bounded. This is complementary to our work, since
our primary interest is in the function $h(x) = I\{x \le t\}$, which is not smooth.

Lemma 3.2. For every $t > 0$, the function

\[
f(x) := \frac{e^{-(t-x)^+} - e^{-t}}{x}, \qquad x > 0,
\]

satisfies the equation

\[
xf'(x) - (x - 1)f(x) = I\{x \le t\} - (1 - e^{-t}), \qquad x \in \mathbb{R}_+, \tag{12}
\]

where $f'$ denotes the left-hand derivative of $f$. Moreover, one has the bounds

\[
\|f'\|_\infty \le t^{-1}, \qquad \|f''\|_\infty \le \max\{t^{-1}, 2t^{-2}\}.
\]

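Proof. Clearly, $f$ is infinitely differentiable on $\mathbb{R}_+ \setminus \{t\}$. The left-hand and right-hand derivatives
at $t$ exist and are unequal, which is why we let $f'$ denote the left-hand derivative of $f$. Then for
$0 < x \le t$,

\[
f'(x) = \frac{d}{dx}\left(\frac{e^{-(t-x)} - e^{-t}}{x}\right) = e^{-t}\,\frac{xe^x - e^x + 1}{x^2}, \tag{13}
\]

which gives

\[
xf'(x) - (x - 1)f(x) = 1 - (1 - e^{-t}).
\]

Similarly, for $x > t$,

\[
f'(x) = -\frac{1 - e^{-t}}{x^2}, \tag{14}
\]

which gives

\[
xf'(x) - (x - 1)f(x) = -(1 - e^{-t}).
\]

Thus, the function $f$ is a solution to (12).

The easiest way to get a uniform bound on $f'$ is perhaps by directly expanding in power series.
When $0 < x \le t$, we recall (13) to get

\[
f'(x) = e^{-t}\,\frac{xe^x - e^x + 1}{x^2} = e^{-t}\sum_{k=0}^\infty \frac{x^k}{k!\,(k+2)}. \tag{15}
\]

This shows that for $x \in (0, t]$

\[
0 \le f'(x) \le e^{-t}\sum_{k=0}^\infty \frac{t^k}{(k+1)!} = \frac{1 - e^{-t}}{t} \le \min\{1, t^{-1}\}.
\]

Again, for $x > t$, we directly see from (14) that $f'(x) \le 0$ and

\[
|f'(x)| \le \frac{1 - e^{-t}}{t^2} \le t^{-1}.
\]

Combining, we get

\[
\|f'\|_\infty \le t^{-1}.
\]

Now, $f'$ is positive in $(0, t]$ and negative in $(t, \infty)$. Therefore $f$ attains its maximum at $t$. It is
now easy to see that for all $x > 0$,

\[
0 \le f(x) \le \frac{1 - e^{-t}}{t} \le 1.
\]

Using (15) we see that for $0 < x \le t$

\[
0 \le f''(x) = e^{-t}\sum_{k=1}^\infty \frac{x^{k-1}}{(k-1)!\,(k+2)} \le e^{-t}\sum_{k=1}^\infty \frac{t^{k-1}}{k!} = \frac{1 - e^{-t}}{t} \le t^{-1},
\]

and for $x > t$,

\[
0 \le f''(x) = \frac{2(1 - e^{-t})}{x^3} \le \frac{2(1 - e^{-t})}{t^3} \le 2t^{-2}.
\]

Combining, we get, for all $x > 0$,

\[
0 \le f''(x) \le \max\{t^{-1}, 2t^{-2}\}.
\]

This completes the proof. □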
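
The two claims of Lemma 3.2 that are used later, namely that $f$ solves (12) and that $\|f'\|_\infty \le t^{-1}$, can be sanity-checked numerically. The Python sketch below is an illustration only; the helper names and the grid are our own choices, and the closed forms for the derivative are taken from (13) and (14).

```python
from math import exp


def f(x, t):
    """f(x) = (exp(-(t - x)^+) - exp(-t)) / x for x > 0."""
    return (exp(-max(t - x, 0.0)) - exp(-t)) / x


def f_prime(x, t):
    """Left-hand derivative of f, from (13) for x <= t and (14) for x > t."""
    if x <= t:
        return exp(-t) * (x * exp(x) - exp(x) + 1.0) / (x * x)
    return -(1.0 - exp(-t)) / (x * x)


def residual(x, t):
    """Left side of (12) minus its right side; should be zero up to rounding."""
    rhs = (1.0 if x <= t else 0.0) - (1.0 - exp(-t))
    return x * f_prime(x, t) - (x - 1.0) * f(x, t) - rhs


def check(t, grid_size=2000, x_max=10.0):
    """Return (largest |residual|, largest |f'|) over a grid in (0, x_max]."""
    xs = [x_max * k / grid_size for k in range(1, grid_size + 1)]
    worst = max(abs(residual(x, t)) for x in xs)
    sup_fprime = max(abs(f_prime(x, t)) for x in xs)
    return worst, sup_fprime
```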

Now the main results of this section will be proved. 

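Proof of Theorem 1.2. Fix $t > 0$ and consider the Stein equation

\[
xf'(x) - (x - 1)f(x) = I[x \le t] - \mathbb{P}[Z \le t] \tag{16}
\]

for $x \ge 0$, where $Z \sim \mathrm{Exp}(1)$. From Lemma 3.2 its solution $f$ satisfies the non-uniform bounds

\[
\|f'\|_\infty \le t^{-1}, \qquad \|f''\|_\infty \le \max\{t^{-1}, 2t^{-2}\}, \tag{17}
\]

where $f''$ denotes the left derivative of $f'$, as $f'$ has a discontinuity in $t$.

Assume first that $W$ is positive. Defining $G(w) = \int_0^w f(x)\,dx$, Taylor's expansion gives that

\[
G(W') = G(W) + Df(W) + D^2\int_0^1 (1 - s) f'(W + sD)\,ds.
\]

The hypothesis $\mathbb{E}(D|W) = -\lambda(W - 1)$ gives that

\[
0 = \mathbb{E}\{G(W') - G(W)\} = \mathbb{E}\{-\lambda(W - 1)f(W)\} + \mathbb{E}\left\{D^2\int_0^1 (1 - s) f'(W + sD)\,ds\right\},
\]

and hence

\[
\mathbb{E}\{(W - 1)f(W)\} = \mathbb{E}\left\{\frac{D^2}{\lambda}\int_0^1 (1 - s) f'(W + sD)\,ds\right\}.
\]

Taking expectation on (16) with respect to $W$, we thus have

\[
\begin{aligned}
\mathbb{P}[W \le t] - \mathbb{P}[Z \le t]
 &= \mathbb{E}\{Wf'(W) - (W - 1)f(W)\} \\
 &= \mathbb{E}\left\{Wf'(W) - \frac{D^2}{\lambda}\int_0^1 (1 - s) f'(W + sD)\,ds\right\} \\
 &= \mathbb{E}\left\{\left(W - \frac{D^2}{2\lambda}\right) f'(W)\right\} + \mathbb{E}\left\{\frac{D^2}{\lambda}\left(\tfrac{1}{2}f'(W) - \int_0^1 (1 - s) f'(W + sD)\,ds\right)\right\} \\
 &= \mathbb{E}\left\{\left(W - \frac{D^2}{2\lambda}\right) f'(W)\right\} + \mathbb{E}\left\{\frac{D^2}{\lambda}\int_0^1 (1 - s)\left(f'(W) - f'(W + sD)\right) ds\right\}.
\end{aligned} \tag{18}
\]

Note now that for any $x, y > 0$,

\[
|f'(x) - f'(y)| \le \begin{cases} \|f''\|_\infty |x - y| & \text{if } x \text{ and } y \text{ lie on the same side of } t, \\ 2\|f'\|_\infty & \text{otherwise.} \end{cases}
\]

Also, if $x$ and $y$ lie on different sides of $t$, then $|x - t| \le |x - y|$. Thus

\[
\begin{aligned}
\int_0^1 (1 - s)|f'(W) - f'(W + sD)|\,ds
 &\le \|f''\|_\infty \int_0^1 (1 - s)\,s|D|\,ds + 2\|f'\|_\infty \int_0^1 (1 - s)\,I[|W - t| \le |sD|]\,ds \\
 &\le \tfrac{1}{6}|D|\,\|f''\|_\infty + \|f'\|_\infty\, I[|W - t| \le |D|].
\end{aligned}
\]

Putting the steps together we obtain from (18)

\[
|\mathbb{P}[W \le t] - \mathbb{P}[Z \le t]| \le \|f'\|_\infty\, \mathbb{E}\left|W - \frac{\mathbb{E}(D^2|W)}{2\lambda}\right| + \frac{1}{6\lambda}\|f''\|_\infty\, \mathbb{E}|D|^3 + \frac{1}{\lambda}\|f'\|_\infty\, \mathbb{E}\{D^2 I[|W - t| \le |D|]\},
\]

and with the bounds (17) the claim follows for positive $W$.

To treat the case where $W$ can also equal 0, choose $0 < \delta < 1$ and define $W_\delta := (1 - \delta)W + \delta$,
$W'_\delta := (1 - \delta)W' + \delta$ and $t_\delta := (1 - \delta)t + \delta$. One sees that $W_\delta$ is a positive random variable, and
that $\mathbb{E}(D_\delta|W_\delta) = -\lambda(W_\delta - 1)$ where $D_\delta := W'_\delta - W_\delta$ and $\lambda$ is the same as for the pair $(W, W')$. Moreover $\mathbb{P}\{W \le t\} =
\mathbb{P}\{W_\delta \le t_\delta\}$, so it follows that

\[
|\mathbb{P}\{W \le t\} - \mathbb{P}\{Z \le t_\delta\}| \le \frac{1}{2\lambda t_\delta}\mathbb{E}|2\lambda W_\delta - \mathbb{E}(D_\delta^2|W)| + \frac{\max\{t_\delta^{-1}, 2t_\delta^{-2}\}}{6\lambda}\mathbb{E}|D_\delta|^3 + \frac{1}{\lambda t_\delta}\mathbb{E}\left(D_\delta^2 I\{|W_\delta - t_\delta| \le |D_\delta|\}\right).
\]

Since $D_\delta = (1 - \delta)D$, the first two error terms are continuous in $\delta$ and converge to the corresponding
error terms for $W$ when $\delta \to 0$. The same is true for the third error term, as can be seen from the
fact that $|W_\delta - t_\delta| \le |D_\delta|$ if and only if $|W - t| \le |D|$. Since also $\mathbb{P}\{Z \le t_\delta\} \to \mathbb{P}\{Z \le t\}$, this completes the proof. □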

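Next, we prove Theorem 3.1.

Proof of Theorem 3.1. First we treat the case that $W$ is always positive. Throughout we
shall be using $V := (2\lambda)^{-1/2}(W' - W)$ instead of $D\ (= W' - W)$, simply because $D$ occurs with a
factor of $(2\lambda)^{-1/2}$ attached with it on most occasions.

Suppose for each $0 < s < t$, we have numbers $u(s, t)$ and $v(s, t)$ such that whenever $s \le a \le b \le t$,
we have

\[
\mathbb{P}\{a \le W \le b\} \le u(s, t)(b - a) + v(s, t).
\]

Fix $t > 0$. Let $A(W) = I\{\mathbb{E}(V^2|W) \le W + t,\ \mathbb{E}(V^4|W) \le \kappa^2(W + t)^2\}$. Then

\[
\mathbb{E}(V^2(1 - A(W))) \le \mathbb{E}\left(\mathbb{E}(V^2|W)\, I\{\mathbb{E}(V^2|W) > W + t \text{ or } \mathbb{E}(V^4|W) > \kappa^2 W^2 + \kappa^2 t^2\}\right) =: \epsilon_1(t).
\]

It follows that

\[
\begin{aligned}
\mathbb{E}(V^2 I\{|W - t| \le |W' - W|\})
 &\le \mathbb{E}(V^2 I\{(W' - W)^2 \ge (W - t)^2\} A(W)) + \epsilon_1(t) \\
 &\le \mathbb{E}(\min\{2\lambda(W - t)^{-2} V^4, V^2\} A(W)) + \epsilon_1(t) \\
 &\le \mathbb{E}(\min\{2\lambda(W - t)^{-2}\,\mathbb{E}(V^4|W), \mathbb{E}(V^2|W)\} A(W)) + \epsilon_1(t) \\
 &\le \mathbb{E}(\min\{2\lambda\kappa^2(W - t)^{-2}(W + t)^2, W + t\}) + \epsilon_1(t).
\end{aligned}
\]

Now

\[
\mathbb{E}(\min\{2\lambda\kappa^2(W - t)^{-2}(W + t)^2, W + t\}) = \int_0^\infty \mathbb{P}\{2\lambda\kappa^2(W - t)^{-2}(W + t)^2 \ge x,\ W + t \ge x\}\,dx. \tag{19}
\]

Now take any $x \ge 8\lambda\kappa^2$. Let $c(x) = \sqrt{2\lambda\kappa^2/x}$. Then the following are easily seen to be equivalent:

\[
2\lambda\kappa^2(W - t)^{-2}(W + t)^2 \ge x \iff |W - t| \le c(x)(W + t) \iff \frac{1 - c(x)}{1 + c(x)}\,t \le W \le \frac{1 + c(x)}{1 - c(x)}\,t.
\]

Let $a(x) = (1 - c(x))/(1 + c(x))$ and $b(x) = (1 + c(x))/(1 - c(x))$. Note that since $x \ge 8\lambda\kappa^2$,
therefore $c(x) \le 1/2$ and so $a(x) \ge 1/3$, $b(x) \le 3$, and

\[
b(x) - a(x) = \frac{4c(x)}{1 - c(x)^2} \le \frac{16}{3}c(x).
\]

Now if $W \le 3t$ then $W + t \le 4t$. Thus, the integrand in (19) is zero for $x > 4t$. Combining, we see
that

\[
\begin{aligned}
\mathbb{E}(\min\{2\lambda\kappa^2(W - t)^{-2}(W + t)^2, W + t\})
 &\le 8\lambda\kappa^2 + \int_{8\lambda\kappa^2}^{4t} \mathbb{P}\{a(x)t \le W \le b(x)t\}\,dx \\
 &\le 8\lambda\kappa^2 + \int_{8\lambda\kappa^2}^{4t} \left(u(\tfrac{1}{3}t, 3t)\,\tfrac{16}{3}c(x)\,t + v(\tfrac{1}{3}t, 3t)\right) dx \\
 &\le 8\lambda\kappa^2 + 22\,u(\tfrac{1}{3}t, 3t)\,t\kappa\sqrt{2\lambda t} + 4t\,v(\tfrac{1}{3}t, 3t).
\end{aligned} \tag{20}
\]

Next, we proceed to find suitable values of $u(s, t)$ and $v(s, t)$. Fix $0 < s \le a \le b \le t$. Let

\[
B(W) = I\left\{\mathbb{E}(V^2 I\{|V| > 2\kappa\sqrt{W + a}\}|W) \le \tfrac{1}{4}(W + a),\ \mathbb{E}(V^2|W) \ge W - \tfrac{1}{4}a\right\}.
\]

Now note that

\[
\mathbb{E}(V^2 I\{|V| > 2\kappa\sqrt{W + a}\}|W) \le \frac{\mathbb{E}(V^4|W)}{4\kappa^2(W + a)}.
\]

From this it is easy to see that

\[
\mathbb{E}(1 - B(W)) \le \mathbb{P}\{\mathbb{E}(V^2|W) < W - \tfrac{1}{4}a \text{ or } \mathbb{E}(V^4|W) > \kappa^2 W^2 + \kappa^2 a^2\} =: \epsilon_2(a).
\]

Note that if $B(W) = 1$ then $\mathbb{E}(V^2 I\{|V| \le 2\kappa\sqrt{W + a}\}|W) \ge \tfrac{3}{4}W - \tfrac{1}{2}a$. So, if $W \ge a$ and $B(W) = 1$,
$\mathbb{E}(V^2 I\{|V| \le 2\kappa\sqrt{W + a}\}|W) \ge \tfrac{1}{4}a$. Thus,

\[
\begin{aligned}
a\,\mathbb{P}\{a \le W \le b\} &\le 4\,\mathbb{E}(V^2 I\{a \le W \le b, |V| \le 2\kappa\sqrt{W + a}\}) + a\,\epsilon_2(a) \\
 &\le 4\,\mathbb{E}(V^2 I\{a \le W \le b, |V| \le 2\kappa\sqrt{b + a}\}) + a\,\epsilon_2(a) \\
 &= 2\lambda^{-1}\,\mathbb{E}(D^2 I\{a \le W \le b, |D| \le 2\kappa\sqrt{2\lambda(b + a)}\}) + a\,\epsilon_2(a),
\end{aligned}
\]

where $D = W' - W$. Using Lemma 2.4 we get

\[
a\,\mathbb{P}\{a \le W \le b\} \le 2\left(b - a + 4\kappa\sqrt{2\lambda(b + a)}\right)\mathbb{E}|W - 1| + a\,\epsilon_2(a).
\]

Finally, note that $\epsilon_2$ is a monotonically decreasing function. Thus, we can take

\[
u(s, t) = \frac{2\,\mathbb{E}|W - 1|}{s}
\]

and

\[
v(s, t) = \frac{16\kappa\sqrt{\lambda t}\,\mathbb{E}|W - 1|}{s} + \epsilon_2(s).
\]

Using these expressions for $u$ and $v$ in (20), we get

\[
\mathbb{E}(V^2 I\{|W - t| \le |W' - W|\}) \le 8\lambda\kappa^2 + 520\,\mathbb{E}|W - 1|\,\kappa\sqrt{\lambda t} + 4\epsilon_2(\tfrac{1}{3}t)\,t + \epsilon_1(t).
\]

Put $e_1(t) = 2\lambda\epsilon_1(t)$ and $e_2(t) = \epsilon_2(t)$ to get the final expression in Theorem 3.1.

Finally, suppose that $W$ might take the value 0, but that $W$ assumes only finitely many values.
As in the proof of Theorem 1.2, for $0 < \delta < 1$ define $W_\delta := (1 - \delta)W + \delta$, $W'_\delta := (1 - \delta)W' + \delta$
and $t_\delta := (1 - \delta)t + \delta$. Since $|W_\delta - t_\delta| \le |D_\delta|$ if and only if $|W - t| \le |D|$, it follows that
$\mathbb{E}\{D^2 I\{|W - t| \le |D|\}\}$ is the limit as $\delta \to 0$ of $\mathbb{E}\{D_\delta^2 I\{|W_\delta - t_\delta| \le |D_\delta|\}\}$. It is easily checked that
$\mathbb{E}(D_\delta^2|W) > 2\lambda(W_\delta + t_\delta)$ implies that $\mathbb{E}(D^2|W) > 2\lambda(W + t)$ and that $\mathbb{E}(D_\delta^4|W) > 4\lambda^2(\kappa^2 W_\delta^2 +
\kappa^2 t_\delta^2)$ implies that $\mathbb{E}(D^4|W) > 4\lambda^2(\kappa^2 W^2 + \kappa^2 t^2)$. We claim that $\mathbb{E}(D_\delta^2|W) < 2\lambda(W_\delta - \tfrac{1}{4}t_\delta)$
implies that $\mathbb{E}(D^2|W) < 2\lambda(W - \tfrac{1}{4}t)$ provided that $\delta$ is sufficiently small. Indeed, since $W$ takes
only finitely many values, there is an $\eta > 0$ such that $\mathbb{E}(D^2|W) < 2\lambda(W - \tfrac{1}{4}t)$ if and only if
$\mathbb{E}(D^2|W) < 2\lambda(W - \tfrac{1}{4}t) + \eta t$. The claim now follows since $\mathbb{E}(D_\delta^2|W) < 2\lambda(W_\delta - \tfrac{1}{4}t_\delta)$ implies that
$\mathbb{E}(D^2|W) < 2\lambda(W - \tfrac{1}{4}t) + 2\lambda\frac{3\delta}{4(1 - \delta)} + \delta\,\mathbb{E}(D^2|W)$. Hence the theorem follows by letting $\delta \to 0$. □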

4. Example: Spectrum of Bernoulli-Laplace chain 

This section proves Theorem 1.4 of the introduction. Throughout we let $W$ denote the random
variable defined by

\[
W(i) := \frac{(n - 2i)(n + 2 - 2i)}{2n},
\]

where $n$ is even and $i \in \{0, 1, \dots, \frac{n}{2}\}$ is chosen with probability $\pi(i)$ equal to

\[
\pi(i) = \begin{cases} \dfrac{\binom{n}{i} - \binom{n}{i-1}}{\binom{n}{n/2}} & \text{if } 1 \le i \le \frac{n}{2}, \\[2ex] \dfrac{1}{\binom{n}{n/2}} & \text{if } i = 0. \end{cases}
\]

Letting $Z$ be an $\mathrm{Exp}(1)$ random variable and $C$ a universal constant, the upper bound

\[
|\mathbb{P}(W \le t) - \mathbb{P}(Z \le t)| \le \frac{C}{\sqrt{n}}
\]

will be proved in two steps. Subsection 4.1 uses the machinery of Section 2 to treat the case that
$t \le 1$, and Subsection 4.2 uses the machinery of Section 3 to treat the case that $t \ge 1$. One
interesting feature of the proof is that the exchangeable pairs used in these two subsections are
different (but closely related). We also show (in Subsection 4.1) that combining the machinery
of Section 2 with a concentration inequality, one can obtain, with less effort, a slightly weaker
$O\big(\sqrt{\log(n)/n}\big)$ upper bound.

Finally, Subsection 4.3 shows that the $O(n^{-1/2})$ rate is sharp, by constructing a sequence of $n$'s
tending to infinity and corresponding $t_n$'s such that

\[
|\mathbb{P}(W_n \le t_n) - \mathbb{P}(Z \le t_n)| = \frac{2e^{-2}}{\sqrt{n}} + O(1/n).
\]


4.1. Upper bound for small t. The purpose of this subsection is to use the machinery of Section
2 to prove Proposition 4.1, which implies the upper bound of Theorem 1.4 of the introduction for
$t \le 1$.

Proposition 4.1.

\[
|\mathbb{P}(W \le t) - \mathbb{P}(Z \le t)| \le \frac{C \cdot \max(1, t^{1/2})}{\sqrt{n}}
\]

for a universal constant $C$.

To begin we define an exchangeable pair $(W, W')$ and perform some computations with it. The
definition of $(W, W')$ and the fact that the computations work out so neatly may seem unmotivated.
There is an algebraic motivation for our choices, and so as not to interrupt our self-contained
probabilistic treatment, we explain this in the appendix.

To construct an exchangeable pair $(W, W')$, we specify a Markov chain $K$ on the set $\{0, 1, \dots, \frac{n}{2}\}$
which is reversible with respect to $\pi$. This means that $\pi(i)K(i, j) = \pi(j)K(j, i)$ for all $i, j$. Given
such a Markov chain $K$, one obtains the pair $(W, W')$ in the usual way (see for instance [RR]):
choose $i$ from $\pi$, let $W = W(i)$, and let $W' = W(j)$, where $j$ is obtained from $i$ by taking one step
using the Markov chain $K$.

The Markov chain which turns out to be useful is a birth-death chain on $\{0, 1, \dots, \frac{n}{2}\}$ where the
transition probabilities are

\[
\begin{aligned}
K(i, i + 1) &:= \frac{n - i + 1}{n(n - 2i)(n - 2i + 1)}, \\
K(i, i - 1) &:= \frac{i}{n(n - 2i + 1)(n - 2i + 2)}, \\
K(i, i) &:= 1 - K(i, i + 1) - K(i, i - 1),
\end{aligned}
\]

with the exception of $K(i, i + 1)$ if $i = n/2$, which we define to be zero.

It is easily checked that $K$ is reversible with respect to $\pi$, so the resulting pair $(W, W')$ is
exchangeable. (In fact the machinery of Section 2 only uses that $W$ and $W'$ have the same law,
which follows from the fact that $K$ has $\pi$ as a stationary distribution, but the exchangeability is
good to record.)
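
The reversibility claim can be confirmed in exact arithmetic for any particular even $n$. The Python sketch below is an illustration only, with hypothetical function names; it encodes $\pi$ and the birth-death transition probabilities as stated above and checks the detailed balance equations $\pi(i)K(i, i+1) = \pi(i+1)K(i+1, i)$ with `fractions.Fraction`, so no rounding is involved.

```python
from fractions import Fraction
from math import comb


def stationary_law(n):
    """pi(i) for i = 0, ..., n/2 (n even), as exact fractions."""
    half = n // 2
    total = comb(n, half)
    return [Fraction(comb(n, i) - (comb(n, i - 1) if i >= 1 else 0), total)
            for i in range(half + 1)]


def birth_death_chain(n):
    """Return (up, down) with up[i] = K(i, i+1) and down[i] = K(i, i-1)."""
    half = n // 2
    up = [Fraction(n - i + 1, n * (n - 2 * i) * (n - 2 * i + 1)) if i < half
          else Fraction(0)  # K(n/2, n/2 + 1) is defined to be zero
          for i in range(half + 1)]
    down = [Fraction(i, n * (n - 2 * i + 1) * (n - 2 * i + 2))
            for i in range(half + 1)]
    return up, down


def is_reversible(n):
    """Check detailed balance and that each row of K is a probability vector."""
    pi = stationary_law(n)
    up, down = birth_death_chain(n)
    half = n // 2
    detailed_balance = all(pi[i] * up[i] == pi[i + 1] * down[i + 1]
                           for i in range(half))
    valid_rows = all(0 <= up[i] + down[i] <= 1 for i in range(half + 1))
    return detailed_balance and valid_rows
```

For a birth-death chain, detailed balance between neighbouring states is all that needs checking, which is why the loop only compares $i$ with $i + 1$.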

Lemma 4.2 performs some moment computations related to the pair $(W, W')$.

Lemma 4.2. Letting $D := W' - W$, one has that:
(1) $\mathbb{E}(D|W) = -\frac{2}{n^2}$ if $W \ne 0$; $\mathbb{E}(D|W = 0) = \frac{1}{n}$.
(2) $\mathbb{E}(W) = 1$.
(3) $\mathbb{E}(D^2|W) = \frac{4}{n^2}$.
(4) $\mathbb{E}(D^4|W) = \left(\frac{32}{n^3} - \frac{64}{n^4}\right)W + \frac{64}{n^4}$.
(5) $\mathbb{E}(D^4) = \frac{32}{n^3}$.

Proof. Since $i$ is determined by $W(i)$, conditional expectations given $W$ can be computed using
conditional expectations given $i$. Supposing that $i \ne n/2$,

\[
\begin{aligned}
\mathbb{E}(D|i) &= K(i, i + 1)(W(i + 1) - W(i)) + K(i, i - 1)(W(i - 1) - W(i)) \\
 &= \frac{n - i + 1}{n(n - 2i)(n - 2i + 1)} \cdot \frac{2(2i - n)}{n} + \frac{i}{n(n - 2i + 1)(n - 2i + 2)} \cdot \frac{2(n - 2i + 2)}{n} \\
 &= -\frac{2}{n^2}.
\end{aligned}
\]

If $i = n/2$, then $\mathbb{E}(D|i) = K(i, i - 1)(W(i - 1) - W(i)) = \frac{1}{n}$, so part 1 is proved.

For part 2, argue as in part 1 (separately treating the cases $i \ne n/2$ and $i = n/2$) to compute
that $\mathbb{E}(D^3|W) = -\frac{16}{n^3}(W - 1)$. Since $W$ and $W'$ are exchangeable, $\mathbb{E}(D^3) = 0$. Thus

\[
\mathbb{E}(W - 1) = -\frac{n^3}{16}\,\mathbb{E}[\mathbb{E}(D^3|W)] = -\frac{n^3}{16}\,\mathbb{E}(D^3) = 0,
\]

so $\mathbb{E}(W) = 1$.

For parts 3 and 4, one argues as in part 1 to compute both sides (separately treating the cases
$i \ne n/2$ and $i = n/2$) and checks that they are equal. For part 5, note that

\[
\mathbb{E}(D^4) = \mathbb{E}[\mathbb{E}(D^4|W)] = \left(\frac{32}{n^3} - \frac{64}{n^4}\right)\mathbb{E}[W] + \frac{64}{n^4} = \frac{32}{n^3},
\]

where the final equality is part 2. □
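
The moment identities of Lemma 4.2 are polynomial identities in $i$ and $n$, so they can be checked exactly for any fixed even $n$. The following Python sketch is an illustration only (function names are hypothetical); it recomputes $\mathbb{E}(D^k|i)$ for $k = 1, \dots, 4$ directly from the chain $K$ and compares with the stated formulas, including the identity $\mathbb{E}(D^3|W) = -\frac{16}{n^3}(W - 1)$ used in the proof of part 2.

```python
from fractions import Fraction
from math import comb


def conditional_moments(n):
    """Return a list of (W(i), pi(i), [E(D^k | i) for k = 1..4]); n even."""
    half = n // 2
    W = [Fraction((n - 2 * i) * (n + 2 - 2 * i), 2 * n) for i in range(half + 1)]
    total = comb(n, half)
    rows = []
    for i in range(half + 1):
        p = Fraction(comb(n, i) - (comb(n, i - 1) if i >= 1 else 0), total)
        up = (Fraction(n - i + 1, n * (n - 2 * i) * (n - 2 * i + 1))
              if i < half else Fraction(0))
        down = Fraction(i, n * (n - 2 * i + 1) * (n - 2 * i + 2))
        step_up = W[i + 1] - W[i] if i < half else Fraction(0)
        step_down = W[i - 1] - W[i] if i >= 1 else Fraction(0)
        moments = [up * step_up ** k + down * step_down ** k for k in (1, 2, 3, 4)]
        rows.append((W[i], p, moments))
    return rows


def lemma_4_2_holds(n):
    """Exact check of parts (1) to (5) of Lemma 4.2 for a given even n."""
    rows = conditional_moments(n)
    ok = True
    for w, _, (m1, m2, m3, m4) in rows:
        ok &= m1 == (Fraction(-2, n ** 2) if w != 0 else Fraction(1, n))
        ok &= m2 == Fraction(4, n ** 2)
        ok &= m3 == Fraction(-16, n ** 3) * (w - 1)
        ok &= m4 == (Fraction(32, n ** 3) - Fraction(64, n ** 4)) * w + Fraction(64, n ** 4)
    mean_w = sum(p * w for w, p, _ in rows)
    mean_d4 = sum(p * m[3] for _, p, m in rows)
    return bool(ok) and mean_w == 1 and mean_d4 == Fraction(32, n ** 3)
```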

Using these moment computations, we deduce Proposition 4.1.

Proof of Proposition 4.1. We apply Theorem 1.1 to the pair $(W, W')$ with the value $\lambda = \frac{2}{n^2}$.
Then the first two error terms actually vanish. Indeed, part 1 of Lemma 4.2 gives that

\[
\mathbb{E}\left|\left(\lambda^{-1}\mathbb{E}(D|W) + 1\right) I[W > 0]\right| = 0,
\]

and part 3 of Lemma 4.2 gives that

\[
\mathbb{E}\left|\tfrac{1}{2\lambda}\mathbb{E}(D^2|W) - 1\right| = 0.
\]

To analyze the third error term, use the Cauchy-Schwarz inequality and parts 3 and 5 of Lemma
4.2 to obtain that

\[
\frac{1}{6\lambda}\mathbb{E}|D|^3 \le \frac{n^2}{12}\sqrt{\mathbb{E}(D^2)\,\mathbb{E}(D^4)} = \frac{2\sqrt{2}}{3\sqrt{n}}.
\]

To bound the fourth error term, apply Theorem 2.3 with

n^ V n n^ n^ 

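Note (as required by the theorem), that $K_2 < K_3$ and that for $n > 12$, $k_2 < k_1$. From part 1 of
Lemma 4.2 and the fact that $\mathbb{P}(W = 0) = \frac{2}{n+2}$, one computes that $\mathbb{E}|\mathbb{E}(D|W)| = \frac{4}{n(n+2)}$.
It is necessary to upper bound

\[
e_1 = \mathbb{E}\left\{\mathbb{E}(D^2|W) \cdot I\left[\mathbb{E}(D^2|W) > k_1 \text{ or } \mathbb{E}(D^4|W) > k_2(W + t)\right]\right\}.
\]

Note from part 3 of Lemma 4.2 that

\[
\mathbb{P}\left[\mathbb{E}(D^2|W) > k_1\right] = 0,
\]

and from part 4 of Lemma 4.2 that

\[
\begin{aligned}
\mathbb{P}\left[\mathbb{E}(D^4|W) > \frac{48}{n^3}(W + t)\right] &\le \mathbb{P}\left[\mathbb{E}(D^4|W) > \frac{48}{n^3}W\right] \\
 &= \mathbb{P}(W < 4/(n + 4)) \\
 &= \mathbb{P}(W = 0) \\
 &= \frac{2}{n + 2}.
\end{aligned} \tag{21}
\]

Thus

\[
e_1 \le \frac{4}{n^2}\,\mathbb{P}(W = 0) = \frac{8}{n^2(n + 2)}.
\]

It is also necessary to upper bound

\[
e_2 = \mathbb{P}\left[\mathbb{E}(D^2|W) < K_3 \text{ or } \mathbb{E}(D^4|W) > K_1^2 K_2 W\right].
\]

Note from part 3 of Lemma 4.2 that

\[
\mathbb{P}\left[\mathbb{E}(D^2|W) < K_3\right] = 0,
\]

and from (21) that $\mathbb{P}[\mathbb{E}(D^4|W) > \frac{48}{n^3}W] = \frac{2}{n+2}$. Thus $e_2 = \frac{2}{n+2}$. Plugging into Theorem 2.3 one
obtains that

\[
\frac{1}{2\lambda}\mathbb{E}\left(D^2 I\{|W - t| \le |D|\}\right) \le \frac{C \cdot \max(1, t^{1/2})}{\sqrt{n}}
\]

for a universal constant $C$. This completes the proof. □
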

To close this subsection, we show how the machinery of Section 2, together with a concentration
inequality for $W' - W$, leads to a simpler proof (avoiding the use of Theorem 2.3) that

\[
|\mathbb{P}(W \le t) - \mathbb{P}(Z \le t)| \le C\sqrt{\frac{\log(n)}{n}}
\]

for a universal constant $C$. We hope that this approach will be useful in other settings (a concentration
inequality for $W' - W$ can be very useful for normal approximation by Stein's method; see
the survey [CS]).

The following lemma is helpful for obtaining a concentration result for $W' - W$.



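Lemma 4.3. Let $a$ be an integer such that $0 \le a \le \frac{n}{2}$. Then $\binom{n}{\frac{n}{2} - a} \Big/ \binom{n}{\frac{n}{2}} \le e^{-a(a-1)/n}$.

Proof. The result is visibly true for $a = 0$, so suppose that $a \ge 1$. Observe that

\[
\begin{aligned}
\frac{\binom{n}{\frac{n}{2} - a}}{\binom{n}{\frac{n}{2}}}
 &= \frac{\left(\frac{n}{2}\right) \cdots \left(\frac{n}{2} - a + 1\right)}{\left(\frac{n}{2} + 1\right) \cdots \left(\frac{n}{2} + a\right)} \\
 &\le \frac{\left(\frac{n}{2}\right) \cdots \left(\frac{n}{2} - a + 1\right)}{\left(\frac{n}{2}\right)^a} \\
 &= \prod_{i=1}^{a-1}\left(1 - \frac{2i}{n}\right) \\
 &= e^{\sum_{i=1}^{a-1}\log\left(1 - \frac{2i}{n}\right)} \\
 &\le e^{-\sum_{i=1}^{a-1}\frac{2i}{n}} \\
 &= e^{-\frac{a(a-1)}{n}}.
\end{aligned}
\]

□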
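
The inequality of Lemma 4.3 is simple to spot-check. The short Python sketch below is an illustration only (the function names are hypothetical); it compares the exact binomial ratio with $e^{-a(a-1)/n}$ for every admissible $a$, allowing a tiny tolerance for floating-point rounding in the equality case $a = 0$.

```python
from fractions import Fraction
from math import comb, exp


def binomial_ratio(n, a):
    """Exact value of C(n, n/2 - a) / C(n, n/2) for even n and 0 <= a <= n/2."""
    return Fraction(comb(n, n // 2 - a), comb(n, n // 2))


def lemma_4_3_holds(n, tol=1e-15):
    """Check C(n, n/2 - a) / C(n, n/2) <= exp(-a(a-1)/n) for all 0 <= a <= n/2."""
    return all(float(binomial_ratio(n, a)) <= exp(-a * (a - 1) / n) + tol
               for a in range(n // 2 + 1))
```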

Proposition 14.41 gives the concentration inequality for W — W. As usual \x\ denotes the smallest 
integer greater than or equal to x. 



+ 1 . 



Proposition 4.4. ¥{\W' -W\> c) < rT^I'^ forc=^(^ ^n\og{n) 

Proof. Since the Markov chain $K$ used to construct $(W, W')$ is a birth-death chain, it is easily
checked from the definition of $W$ that $|W'(i) - W(i)| \le \frac{2}{n}(n - 2i + 2)$ for all $i$. Thus for $c$ as in the
proposition,

¥{\W'-W\>c) < ¥(^{n-2i + 2)> c 

n cn 

i< hi 

2 4 



n 

^<- + l 



-nlog(n) 



+ 1 



From the definition of the probability measure $\pi$, it is clear that for integral $a$,
\[
\mathbb{P}\Bigl(i < \frac{n}{2} + 1 - a\Bigr) = \binom{n}{n/2-a} \Big/ \binom{n}{n/2}.
\]
Hence the proposition follows from Lemma 4.3.



□ 



This leads to the following proposition. 
Proposition 4.5. 



\w < t)-^{Z <t)\<C 



log(n) 



n 



for a universal constant C . 

Proof. As in the proof of Proposition 14.11 apply Theorem 11.11 to the pair (VF, VF') with the value 
A = The first three terms are bounded as in the proof of Proposition 14.11 To bound the fourth 



term, note from Theorem 



that 



1 

—¥.{dH{\W -t\< \D\]) < n'^cE\E{D\W)\ + —E{dH{\D\ > c}) 
2A 4 
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for any c > 0. From part 1 of Lemma 14.21 one computes that E|E(D|VF)| = :^;;(^^^- One checks 
from the definitions that |T^' - M^l < 2 + ^, so that {W - Wf < 16 since n is even. Choosing 
c = ^ ^ ^J'^nAog{n) + 1^ , it follows from Proposition 14.41 that 

E{dH{\D\ > c}) < 16F{\D\ > c) < 16n~^/2. 
This proves the proposition. □ 

4.2. Upper bound for large t. The purpose of this subsection is to apply the machinery of
Section 3 to prove the following proposition, which gives the upper bound in Theorem 1.4 in the
introduction for $t \ge 1$.

Proposition 4.6. 

,™/^^. N M C-max(l,t~^) 
\¥{W <t) -F{Z <t)\ < 



for a universal constant C . 

The pair $(W, W')$ used in this subsection is somewhat different from the pair used in Subsection
4.1; for a discussion of the relationship between the two pairs see the remark below. As with the
pair from Subsection 4.1, the definition and the fact that the computations work out so nicely may
seem unmotivated. The algebraic motivation for the choices is discussed in the appendix.

To construct an exchangeable pair $(W, W')$, we specify a Markov chain $K$ on the set $\{0, 1, \dots, n/2\}$
which is reversible with respect to $\pi$ (i.e. one has that $\pi(i)K(i,j) = \pi(j)K(j,i)$ for all $i, j$). Given
such a Markov chain $K$, one obtains the pair $(W, W')$ by choosing $i$ from $\pi$, letting $W = W(i)$, and
setting $W' = W(j)$, where $j$ is obtained from $i$ by taking one step using the Markov chain $K$.

The Markov chain which turns out to be useful is a birth-death chain on $\{0, 1, \dots, n/2\}$ whose
only non-zero transition probabilities are
\[
K(i,i+1) := \frac{(n-i+1)(n-2i)}{n(n-2i+1)}, \qquad K(i,i-1) := \frac{i(n-2i+2)}{n(n-2i+1)}.
\]

It is easily checked that $K$ is reversible with respect to $\pi$, so that $(W, W')$ is exchangeable. (In fact
the machinery of Section 3 only uses that $W$ and $W'$ have the same law.)
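The claim that $K$ is a reversible birth-death chain can be checked in exact arithmetic. The sketch below is illustrative and rests on one assumption that is not restated in this subsection: that the stationary measure is $\pi(i) = \bigl(\binom{n}{i} - \binom{n}{i-1}\bigr)/\binom{n}{n/2}$ on $\{0, \dots, n/2\}$ (with $\binom{n}{-1} = 0$), in line with the dimensions $d_i$ given in the appendix. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def stationary(n: int, i: int) -> Fraction:
    """Assumed stationary measure pi(i) = (C(n,i) - C(n,i-1)) / C(n,n/2)."""
    prev = comb(n, i - 1) if i >= 1 else 0
    return Fraction(comb(n, i) - prev, comb(n, n // 2))


def k_up(n: int, i: int) -> Fraction:
    """K(i, i+1) as given in the text."""
    return Fraction((n - i + 1) * (n - 2 * i), n * (n - 2 * i + 1))


def k_down(n: int, i: int) -> Fraction:
    """K(i, i-1) as given in the text."""
    return Fraction(i * (n - 2 * i + 2), n * (n - 2 * i + 1))


def is_stochastic_and_reversible(n: int) -> bool:
    """Check row sums, boundary behaviour and detailed balance for even n."""
    half = n // 2
    rows_ok = all(k_up(n, i) + k_down(n, i) == 1 for i in range(half + 1))
    stays_in_range = k_down(n, 0) == 0 and k_up(n, half) == 0
    balance = all(
        stationary(n, i) * k_up(n, i) == stationary(n, i + 1) * k_down(n, i + 1)
        for i in range(half)
    )
    total_mass = sum(stationary(n, i) for i in range(half + 1)) == 1
    return rows_ok and stays_in_range and balance and total_mass


if __name__ == "__main__":
    print(all(is_stochastic_and_reversible(n) for n in range(2, 42, 2)))
```

Note that the two transition probabilities sum to exactly 1 in every state, so under these formulas the chain never holds in place.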

Remark: If K{i,j) denotes the transition probabilities of this subsection, and K{i,j) denotes 
the transition probabilities from Subsection 14.11 one can verify the relation 

~ 4 K{i,j) . 

Letting D = W' — W for the pair of this subsection and D, W the corresponding quantities for the 
pair from Subsection 14.11 it follows that 

for all r. 

Lemma 4.7 performs some moment computations related to the pair $(W, W')$.

Lemma 4.7. Letting $D := W' - W$, one has that:
(1) $\mathbb{E}(D|W) = -\frac{4}{n}(W-1)$.
(2) $\mathbb{E}(W) = 1$.

(3) $\mathbb{E}(D^2|W) = \frac{8}{n}W - \frac{16}{n^2}(W-1)$.

(4) $\mathrm{Var}(W) = 1$.

(5) $\mathbb{E}[D^4|W] = \frac{64}{n^2}\Bigl(W^2 + \frac{2W(3-2W)}{n} + \frac{4(1-W)}{n^2}\Bigr)$.

(6) $\mathbb{E}[D^4|W] \le \frac{256}{n^2}W^2 + \frac{256}{n^4} I\{W = 0\}$.

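Proof. For part 1, by the construction of $(W, W')$ one has that
\[
\mathbb{E}(D|i) = K(i,i+1)[W(i+1)-W(i)] + K(i,i-1)[W(i-1)-W(i)]
= \frac{(n+1-i)(n-2i)}{n(n+1-2i)}\Bigl(\frac{4i}{n}-2\Bigr) + \frac{i(n-2i+2)}{n(n+1-2i)}\Bigl(2-\frac{4(i-1)}{n}\Bigr).
\]
Elementary simplifications show this to equal $-\frac{4}{n}(W(i) - 1)$.

For part 2, since $W$ and $W'$ have the same law, one has that $\mathbb{E}(D) = 0$. By part 1,
\[
\mathbb{E}(D) = \mathbb{E}[\mathbb{E}(D|W)] = -\frac{4}{n}\mathbb{E}(W - 1),
\]
and the result follows.

For part 3, the construction of $(W, W')$ gives that
\[
\mathbb{E}[D^2|i] = K(i,i+1)[W(i+1)-W(i)]^2 + K(i,i-1)[W(i-1)-W(i)]^2
= \frac{(n+1-i)(n-2i)\bigl(\frac{4i}{n}-2\bigr)^2}{n(n+1-2i)} + \frac{i(n-2i+2)\bigl(2-\frac{4(i-1)}{n}\bigr)^2}{n(n+1-2i)}.
\]
Part 3 now follows by elementary algebra.

For part 4, observe that
\[
\begin{aligned}
\mathbb{E}[D^2] &= \mathbb{E}[\mathbb{E}[(W'-W)^2|W]] \\
&= \mathbb{E}[(W')^2] + \mathbb{E}(W^2) - \mathbb{E}[2W\,\mathbb{E}(W'|W)] \\
&= 2\mathbb{E}(W^2) - \mathbb{E}[2W\,\mathbb{E}(W'|W)] \\
&= 2\mathbb{E}(W^2) - \mathbb{E}\Bigl[2W\Bigl(\Bigl(1-\frac{4}{n}\Bigr)W + \frac{4}{n}\Bigr)\Bigr] \\
&= \frac{8}{n}\bigl(\mathbb{E}(W^2) - 1\bigr).
\end{aligned}
\]
The third equality used that $W$ and $W'$ have the same distribution. The fourth equality used part
1, and the final equality used part 2. Now parts 2 and 3 imply that $\mathbb{E}[D^2] = \frac{8}{n}$. Thus $\mathbb{E}(W^2) = 2$,
which together with part 2 implies that $\mathrm{Var}(W) = 1$.

For part 5, note by the construction of $(W, W')$ that
\[
\mathbb{E}(D^4|i) = K(i,i+1)[W(i+1)-W(i)]^4 + K(i,i-1)[W(i-1)-W(i)]^4
= \frac{(n+1-i)(n-2i)\bigl(\frac{4i}{n}-2\bigr)^4}{n(n+1-2i)} + \frac{i(n-2i+2)\bigl(2-\frac{4(i-1)}{n}\bigr)^4}{n(n+1-2i)}.
\]
Elementary simplifications complete the proof of part 5.

Part 6 will follow from part 5. If $W = 0$, then $\mathbb{E}[D^4|W] = \frac{256}{n^4}$, so part 6 is valid in this case. If
$W \ne 0$, then by the definition of $W$ it follows that $W \ge \frac{4}{n}$. Dropping the non-positive terms in part 5, note that
\[
\mathbb{E}[D^4|W] - \frac{256W^2}{n^2} \le \frac{64}{n^2}\Bigl(W^2 + \frac{6W}{n} + \frac{4}{n^2}\Bigr) - \frac{256W^2}{n^2}
= \frac{64}{n^2}\Bigl(-3W^2 + \frac{6W}{n} + \frac{4}{n^2}\Bigr).
\]
It is easy to see that $-3W^2 + \frac{6W}{n} + \frac{4}{n^2} < 0$ if $W \ge \frac{4}{n}$, implying that $\mathbb{E}[D^4|W] \le \frac{256W^2}{n^2}$ if $W \ne 0$. □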
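The moment identities of Lemma 4.7 can be confirmed in exact rational arithmetic for small even $n$. The sketch below is a check under stated assumptions, not part of the proof: it takes $W(i) = \frac{n}{2} + 1 - \frac{2i(n+1-i)}{n}$ (the specialization of $W(i) = \frac{n}{2}\omega_i(1) + 1$ from the appendix), the stationary measure $\pi(i) = \bigl(\binom{n}{i} - \binom{n}{i-1}\bigr)/\binom{n}{n/2}$, and $\lambda = 4/n$. Function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def stationary(n: int, i: int) -> Fraction:
    """Assumed stationary measure pi(i) = (C(n,i) - C(n,i-1)) / C(n,n/2)."""
    prev = comb(n, i - 1) if i >= 1 else 0
    return Fraction(comb(n, i) - prev, comb(n, n // 2))


def W(n: int, i: int) -> Fraction:
    """Assumed closed form W(i) = n/2 + 1 - 2i(n+1-i)/n."""
    return Fraction(n, 2) + 1 - Fraction(2 * i * (n + 1 - i), n)


def conditional_moment(n: int, i: int, r: int) -> Fraction:
    """E[D^r | i] for the birth-death chain K of this subsection."""
    denom = n * (n - 2 * i + 1)
    up = Fraction((n - i + 1) * (n - 2 * i), denom)
    down = Fraction(i * (n - 2 * i + 2), denom)
    return (up * (W(n, i + 1) - W(n, i)) ** r
            + down * (W(n, i - 1) - W(n, i)) ** r)


def lemma_47_holds(n: int) -> bool:
    """Verify parts (1)-(6) exactly for a given even n."""
    lam = Fraction(4, n)
    states = range(n // 2 + 1)
    for i in states:
        w = W(n, i)
        m1, m2, m4 = (conditional_moment(n, i, r) for r in (1, 2, 4))
        part5 = Fraction(64, n**2) * (w**2 + 2 * w * (3 - 2 * w) / n
                                      + 4 * (1 - w) / n**2)
        part6 = Fraction(256, n**2) * w**2 + (Fraction(256, n**4) if w == 0 else 0)
        if m1 != -lam * (w - 1):
            return False
        if m2 != 2 * lam * w - lam**2 * (w - 1):
            return False
        if m4 != part5 or m4 > part6:
            return False
    mean = sum(stationary(n, i) * W(n, i) for i in states)
    second = sum(stationary(n, i) * W(n, i) ** 2 for i in states)
    return mean == 1 and second - mean**2 == 1


if __name__ == "__main__":
    print(all(lemma_47_holds(n) for n in range(2, 32, 2)))
```

Using `Fraction` avoids any rounding, so equality tests are exact identities for each tested $n$.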

Proof of Proposition 4.6. One applies Theorem 1.2 to the pair $(W, W')$. By part 1 of Lemma
4.7 the hypotheses are satisfied with $\lambda = \frac{4}{n}$.

Consider the first error term in Theorem 1.2. By parts 3 and 4 of Lemma 4.7,
\[
\frac{\mathbb{E}|2\lambda W - \mathbb{E}[D^2|W]|}{2\lambda t} = \frac{2}{tn}\mathbb{E}|W-1| \le \frac{2}{tn}\sqrt{\mathbb{E}(W-1)^2} = \frac{2}{tn}.
\]

Consider the second error term in Theorem 1.2. By the Cauchy-Schwarz inequality,
\[
\mathbb{E}|D|^3 \le \sqrt{\mathbb{E}[D^2]\,\mathbb{E}[D^4]}.
\]

Taking expectations in part 3 of Lemma 4.7 gives that $\mathbb{E}[D^2] = \frac{8}{n}$. Taking expectations in part 5 of
Lemma 4.7 gives that $\mathbb{E}[D^4] = \frac{128(n-1)}{n^3} \le \frac{128}{n^2}$. Thus the second error term in Theorem 1.2 is at

most ^—r^ -. 

To bound the third error term in Theorem 1.2 one applies Theorem 3.1 with $k = 2$. Note from
part 4 of Lemma 4.7 that $\mathbb{E}|W - 1| \le \sqrt{\mathbb{E}(W-1)^2} = 1$. It is necessary to bound
\[
e_1(t) = \mathbb{E}\bigl[\mathbb{E}(D^2|W)\, I\{\mathbb{E}(D^2|W) > 2\lambda(W+t) \text{ or } \mathbb{E}(D^4|W) > 4\lambda^2(k^2W^2 + k^2t^2)\}\bigr].
\]
Part 3 of Lemma 4.7 implies that $\mathbb{E}(D^2|W) > 2\lambda(W+t)$ if and only if $W - 1 < -\frac{nt}{2}$. Part 5 of
Lemma 4.7 implies that
\[
\mathbb{E}(D^4|W) > 4\lambda^2(k^2W^2 + k^2t^2)
\]
can happen only if $W = 0$. Thus
\[
e_1(t) \le \mathbb{E}\Bigl[\mathbb{E}[D^2|W]\, I\Bigl\{W - 1 < -\frac{nt}{2}\Bigr\}\Bigr] + \mathbb{P}(W = 0)\,\mathbb{E}[D^2|W = 0]. \tag{22}
\]

To bound the first term in (22), note by part 3 of Lemma 4.7 that
\[
\mathbb{E}[D^2|W] = 2\lambda + (2\lambda - \lambda^2)(W - 1).
\]
Since $n > 2$, one has that $2\lambda - \lambda^2 > 0$. It follows that if $W - 1 < 0$, then $\mathbb{E}[D^2|W] < 2\lambda$. Hence
the first term in (22) is at most $\frac{8}{n}\mathbb{P}(W - 1 < -\frac{nt}{2})$. By Chebyshev's inequality, this is at most $\frac{32}{n^3t^2}$.
To bound the second term in (22), note that $\mathbb{P}(W = 0) \le \frac{2}{n}$. Also part 3 of Lemma 4.7 gives that
$\mathbb{E}[D^2|W = 0] = \frac{16}{n^2}$, so that the second term in (22) is at most $\frac{32}{n^3}$. Summarizing, we have shown
that $e_1(t) \le \frac{32}{n^3}\bigl(1 + \frac{1}{t^2}\bigr)$.

It is also necessary to bound 

e2(t) = F{E{D^\W) < 2X{W - U) or E{D^\W) > AX^{k^W^ + kH^)}. 
Part 3 of Lemma 4.7 gives that



if and only if (Vl^ — 1) > Since W has mean and variance 1, Chebyshev's inequahty imphes that 
this occurs with probabihty at most . By part 5 of Lemma 14.7^ 

imphes that W = 0. Since ¥{W = 0) < ^, it follows that e2{t) < ^ + 

Summarizing, the bounds on $\mathbb{E}|W - 1|$, $e_1(t)$, $e_2(t)$ give that the third error term in Theorem 1.2
is at most max{i,t — } S is a universal constant. Adding this to the first two error terms 

completes the proof. □ 

4.3. Lower bound. The purpose of this subsection is to prove the lower bound from Theorem 1.4
in the introduction.

Proposition 4.8. There is a sequence of $n$'s tending to infinity, and corresponding $t_n$'s, such that

\nW < tn) -F{Z<tn)\ = ^ + 0{l/n). 



Proof. Given n, define i = \^ — ^/n\ and t„ = ^" ^^^2" ^^"'"^^ . The sequence of n's will consist of 
even perfect squares; then i = \^ — ^/n\ = § — integer and the ceiling function can be 

ignored. 



Clearly 



(^>tn) = e(-^-^)=e-2fl- A+o(i) 

' 'n n 



Also 



Note that for integral a, 



nw>t^)=ni<'^-v^) = -^^- 



(i) (i + f)M(i + ^) 

= ^ c^^?r,^fi°g(i-#)-i°g(i+#: 

(1 + 2a) 



Since a = -y/n, one obtains that 



nW>tn) = ^e-'+^+''(") 

= e-2 + 0(l/n), 

and the result follows. □ 
Remark: Similar ideas give another proof of an $O(n^{-1/2})$ upper bound for $|\mathbb{P}(W \le t) - \mathbb{P}(Z \le t)|$
when $t$ is fixed. This argument was sketched to us by a referee of a much earlier (2006) version of
this paper, and goes as follows. The first step is to consider $t = \frac{2(j^2+j)}{n}$ where $j$ is integral. Then
$\mathbb{P}(W \ge t) = \binom{n}{n/2-j} \big/ \binom{n}{n/2}$. From page 1077 of [O], one has the asymptotics

(23) ^ = e-^-«(^) 



for j < n/4. Since t is fixed, one has that j = 0(n^/^) and so 

(24) \F{W >t)- ¥{Z > t)\ = |e"^+^(tf ^ - e "''n^'' | = Oin-^^). 

The second step is to give a discretization argument allowing one to also use non- integral j. 
The point is that for fixed t and n growing, one can find an integer j such that '^'^^ ^^'^ ^ t < 
2[(i+i)^+(i+i)] _ j ^ o(^i/2)^ one easily checks that 



^ ^ 2if+j) \ _ p > 2[(j + l)^ + (i + l)] 



(25) 

n / V n 

and (using ([23|) ) that 



0(n 



-1/21 



(26) 



^> 2(,^+,) X_/^^ 2[(, + l)^ + (, + !)] 



n / \ n 



(n/2-j) (n/2-i-l) 
U/2/' 



0(n 



-l/2^ 



\n/2/ Vra/2 

The 0(n"^/2) upper bound for \F{W <t)- F{Z < t)\ with arbitrary t > fixed follows from (f24l) . 
(El, and (El. 



Appendix A. Exchangeable Pair and Moment Computations: Algebraic Approach 

The purpose of this appendix is to explain an algebraic approach to the construction of the
exchangeable pair $(W, W')$ in Subsection 4.2 and to the moment computations in Lemma 4.7. Since
the exchangeable pair in Subsection 4.1 is related to that of Subsection 4.2 (see the discussion in
Subsection 4.2), this appendix gives insight into that exchangeable pair too. Throughout we give
results for the Johnson graph $J(n,k)$, as this contains the Bernoulli-Laplace Markov chain as a
special case $k = n/2$.

Let $G$ be a finite group and $K$ a subgroup of $G$. One calls $(G, K)$ a Gelfand pair if the induced
representation $1_K^G$ is multiplicity free. For background on this concept, see Chapter 3 of [D], Chapter
7 of pc], or Chapters 19 and 20 of [H].

Suppose that $(G, K)$ is a Gelfand pair, so that $1_K^G$ decomposes as $\bigoplus_{i=0}^{s} V_i$, where $V_0$ is the trivial
module. Letting $d_i$ be the dimension of $V_i$, one can define a probability measure $\pi$ on $\{0, \dots, s\}$ by
$\pi(i) = d_i |K| / |G|$. Associated to each value of $i$ between $0$ and $s$ is a "spherical function" $\omega_i$, which is
a certain map from the double cosets of $K$ in $G$ to the complex numbers. Hence $\pi$ can be viewed
as a probability measure on spherical functions.

The spectrum of the Johnson graph $J(n, k)$ can be understood in the language of spherical functions
of Gelfand pairs; this goes back to [DS], which used this viewpoint to study the convergence
rate of random walk on $J(n, k)$. To describe this, suppose without loss of generality that $0 < k \le n/2$.
Let $G$ be the symmetric group $S_n$, and $K$ the subgroup $S_k \times S_{n-k}$. Then the space $G/K$ is in
bijection with the vertices of the Johnson graph. There are $k + 1$ spherical functions $\{\omega_0, \dots, \omega_k\}$,
and the dimension $d_i$ is equal to $\binom{n}{i} - \binom{n}{i-1}$ if $1 \le i \le k$ and to $1$ if $i = 0$. The double cosets
$K_0, K_1, \dots, K_k$ of $K$ in $G$ are also indexed by the numbers $0, 1, \dots, k$; the double coset corresponding
to $j$ consists of those permutations $\tau$ in $S_n$ such that $|\{1, \dots, k\} \cap \{\tau(1), \dots, \tau(k)\}| = k - j$.
Letting $\omega_i(j)$ denote the value of $\omega_i$ on the double coset indexed by $j$, it is known that
\[
\omega_i(j) = \sum_{m=0}^{i} \frac{(-i)_m\,(i-n-1)_m\,(-j)_m}{(k-n)_m\,(-k)_m\,m!},
\]
where $(j)_m = j(j+1)\cdots(j+m-1)$ for $m \ge 1$ and $(j)_0 = 1$. The spectrum of random walk on
the Johnson graph consists of the numbers $\omega_i(1)$ with multiplicity $d_i$.
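To make the hypergeometric formula concrete, the following Python sketch evaluates $\omega_i(j)$ exactly and checks two consequences: $\omega_i(0) = 1$, and $\omega_i(1) = 1 - \frac{i(n+1-i)}{k(n-k)}$, the familiar form of the eigenvalues of random walk on $J(n,k)$. The summation range $m = 0, \dots, i$ is an assumption (terms with $m > i$ vanish because $(-i)_m = 0$), and the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def pochhammer(x: int, m: int) -> int:
    """Rising factorial (x)_m = x(x+1)...(x+m-1), with (x)_0 = 1."""
    result = 1
    for s in range(m):
        result *= x + s
    return result


def omega(n: int, k: int, i: int, j: int) -> Fraction:
    """Spherical function omega_i(j) for J(n, k), assuming 0 <= i, j <= k <= n/2."""
    total = Fraction(0)
    for m in range(i + 1):
        num = pochhammer(-i, m) * pochhammer(i - n - 1, m) * pochhammer(-j, m)
        den = pochhammer(k - n, m) * pochhammer(-k, m) * factorial(m)
        total += Fraction(num, den)
    return total


def eigenvalue_formula_holds(n: int, k: int) -> bool:
    """Check omega_i(0) = 1 and omega_i(1) = 1 - i(n+1-i)/(k(n-k)) for all i."""
    return all(
        omega(n, k, i, 0) == 1
        and omega(n, k, i, 1) == 1 - Fraction(i * (n + 1 - i), k * (n - k))
        for i in range(k + 1)
    )


if __name__ == "__main__":
    print(eigenvalue_formula_holds(10, 4))
    print([omega(4, 2, 2, j) for j in range(3)])
```

For $i \le k \le n/2$ none of the denominators $(k-n)_m (-k)_m$ vanishes, so the exact `Fraction` arithmetic never divides by zero in this range.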

Specializing to $k = n/2$ in the previous paragraph, the random variable $W$ studied in Section 4
is equal to $W(i) = \frac{n}{2}\omega_i(1) + 1$, so up to constants is a random spherical function of the Gelfand
pair $(G, K)$. Section 4 of the paper [F4] used Stein's method to study random spherical functions
of Gelfand pairs. Although the examples studied there were all for normal approximation, many
of the results are general. For example, an exchangeable pair $(W, W')$ was constructed using a
reversible Markov chain. Specializing to the Gelfand pair corresponding to $J(n,k)$, the Markov
chain is on the set $\{0, 1, \dots, k\}$ and transitions from $i$ to $j$ with probability

, k 

I I r=0 

Proposition A.1 proves that the Markov chain $L$ is a birth-death chain (and specializes to the
birth-death chain of Subsection 4.2 when $k = n/2$). This is interesting, since from the definition of $L$
it is not even evident that it is a birth-death chain.

Proposition A.1. The Markov chain $L$ on the set $\{0, 1, \dots, k\}$ is a birth-death chain with transition
probabilities
\[
\begin{aligned}
L(i,i+1) &= \frac{n(n+1-i)(n-i-k)(k-i)}{k(n-k)(n+1-2i)(n-2i)}, \\
L(i,i-1) &= \frac{in(n+1-i-k)(k+1-i)}{k(n-k)(n+2-2i)(n+1-2i)}, \\
L(i,i) &= \frac{i(n+1-i)(n-2k)^2}{k(n-k)(n-2i)(n+2-2i)}.
\end{aligned}
\]

Proof. The spherical function $\omega_i(K_r)$ is the Hahn polynomial $Q_n(x; \alpha, \beta, N)$ where $x = r$, $n = i$,
$N = k$, $\alpha = k - n - 1$, $\beta = -k - 1$. Properties of these polynomials are given on pages 33-34 of
[KoSw]. In particular, they satisfy a recurrence relation
\[
-r\,\omega_i(K_r) = A_i\,\omega_{i+1}(K_r) - (A_i + B_i)\,\omega_i(K_r) + B_i\,\omega_{i-1}(K_r)
\]
where
\[
A_i = \frac{(n+1-i)(n-k-i)(k-i)}{(n+1-2i)(n-2i)}
\]
and
\[
B_i = \frac{i(n+1-k-i)(k+1-i)}{(n+2-2i)(n+1-2i)}.
\]
Since $\omega_1(K_r) = 1 - \frac{nr}{k(n-k)}$, it follows that
\[
\begin{aligned}
\omega_1(K_r)\,\omega_i(K_r) ={}& \frac{n(n+1-i)(n-i-k)(k-i)}{k(n-k)(n+1-2i)(n-2i)}\,\omega_{i+1}(K_r) \\
&+ \frac{i(n+1-i)(n-2k)^2}{k(n-k)(n-2i)(n+2-2i)}\,\omega_i(K_r) \\
&+ \frac{in(n+1-i-k)(k+1-i)}{k(n-k)(n+2-2i)(n+1-2i)}\,\omega_{i-1}(K_r).
\end{aligned}
\]
The result now follows immediately from the orthogonality relations for Hahn polynomials [KoSw],
which are a special case of the orthogonality relations for spherical functions of a Gelfand pair
[Ma]. □
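The formulas of Proposition A.1 lend themselves to a quick exact check. The sketch below verifies, for a few pairs with $k < n/2$, that the three probabilities sum to 1 and that detailed balance holds with weights $d_i = \binom{n}{i} - \binom{n}{i-1}$; it also compares the case $k = n/2$ with the chain $K$ of Subsection 4.2. This is an illustrative check with hypothetical function names. When $k = n/2$ the expressions for $L(i,i+1)$ and $L(i,i)$ take the form $0/0$ at $i = k$, so that state is skipped for those two entries.

```python
from fractions import Fraction
from math import comb


def l_up(n: int, k: int, i: int) -> Fraction:
    """L(i, i+1) from Proposition A.1."""
    return Fraction(n * (n + 1 - i) * (n - i - k) * (k - i),
                    k * (n - k) * (n + 1 - 2 * i) * (n - 2 * i))


def l_down(n: int, k: int, i: int) -> Fraction:
    """L(i, i-1) from Proposition A.1."""
    return Fraction(i * n * (n + 1 - i - k) * (k + 1 - i),
                    k * (n - k) * (n + 2 - 2 * i) * (n + 1 - 2 * i))


def l_stay(n: int, k: int, i: int) -> Fraction:
    """L(i, i) from Proposition A.1."""
    return Fraction(i * (n + 1 - i) * (n - 2 * k) ** 2,
                    k * (n - k) * (n - 2 * i) * (n + 2 - 2 * i))


def dimension(n: int, i: int) -> int:
    """d_i = C(n, i) - C(n, i-1), with d_0 = 1."""
    return comb(n, i) - (comb(n, i - 1) if i >= 1 else 0)


def is_reversible_birth_death(n: int, k: int) -> bool:
    """Row sums and detailed balance, assuming 0 < k < n/2."""
    rows = all(l_up(n, k, i) + l_stay(n, k, i) + l_down(n, k, i) == 1
               for i in range(k + 1))
    balance = all(dimension(n, i) * l_up(n, k, i)
                  == dimension(n, i + 1) * l_down(n, k, i + 1)
                  for i in range(k))
    return rows and balance


def matches_subsection_42(n: int) -> bool:
    """Compare L with K of Subsection 4.2 when k = n/2 (n even)."""
    k = n // 2
    ups = all(l_up(n, k, i)
              == Fraction((n - i + 1) * (n - 2 * i), n * (n - 2 * i + 1))
              for i in range(k))
    downs = all(l_down(n, k, i)
                == Fraction(i * (n - 2 * i + 2), n * (n - 2 * i + 1))
                for i in range(k + 1))
    return ups and downs


if __name__ == "__main__":
    print(is_reversible_birth_death(9, 4), matches_subsection_42(12))
```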



To conclude, we note that there is an algebraic way to compute the moments $\mathbb{E}(W' - W)^m$ and
the conditional moments $\mathbb{E}[(W' - W)^m \mid i]$. The interesting point about this approach is that it does
not require one to explicitly compute the transition probabilities of the Markov chain $L$, or even to
know that in this particular case it is a birth-death chain. Moreover, some of the quantities which
appear have direct interpretations in terms of random walk on the Johnson graph.

To be precise, Lemma 4.12 of [F4] implies that $\mathbb{E}(W' - W)^m$ is equal to

7^ I \ rn/2 rn / \ * i r^l 



)m/z "t / \ s I 



11/ ^ ^ r=0 

Here $p_j(K_r)$ is the chance that random walk on the Johnson graph $J(n, k)$, started at a particular
vertex, is distance $r$ away from the start vertex after $j$ steps. Also, the proof of the lemma gives
that $\mathbb{E}[(W' - W)^m \mid i]$ is equal to

V I I / V / 

These expressions are easily evaluated for small $m$, and one obtains another proof of Lemma 4.7.